Python Parser Tutorial

Introduction

Using Python as the parser language, the IKE will be able to:

  • Write parser scripts in a well-documented, popular language
  • Easily debug the scripts
  • Write unit tests

Prepare your environment 

1. Install Python 3.7

Use any tutorial that matches your OS, for example https://realpython.com/installing-python/

Make sure the installation succeeded by running the following in a terminal:

python3 --version

The result should be

Python 3.7.4

2. Install indeni-parser

Using the terminal, install the latest version of indeni-parser

pip3 install "indeni_parser==0.0.0.*" --extra-index-url https://indeni.jfrog.io/indeni/api/pypi/indeni-pypi-develop/simple

The result should be similar to:

Looking in indexes: https://service.indeni-ops.com/repo/api/pypi/indeni-pypi/simple
Collecting indeni-parser
  Downloading https://service.indeni-ops.com/repo/api/pypi/indeni-pypi/indeni-parser/0.0.0.1667/indeni_parser-0.0.0.1667-py3-none-any.whl
Installing collected packages: indeni-parser
Successfully installed indeni-parser-0.0.0.1667

To uninstall indeni-parser, use:

pip3 uninstall indeni-parser

3. Install PyCharm

Install the community edition of PyCharm

https://www.jetbrains.com/pycharm/download

4. Define Project Interpreter

  • Open the relevant project in PyCharm (for example indeni-knowledge)
  • On the top menu go to PyCharm → Preferences...
  • On the left pane go to Project → Project Interpreter
  • The Project Interpreter should be: 

    /usr/local/bin/python3.7

    Continue with the next steps if the Project Interpreter is not as mentioned above, or if it is one of the following:

    /usr/local/bin/python2 - using python2
    /usr/local/bin/python - using default python
    /document/venv/bin/python - using virtual env
  • Open the Project Interpreter window and create a new one. 
  • While creating the new Project Interpreter, choose Existing environment rather than a new one.
  • Point it to the Python 3.7 binary, for example:

    /usr/local/bin/python3.7

5. Add indeni-parser template

  • Download the file indeni_parser_template.zip
  • On the PyCharm top menu go to File → Import Settings... and choose the downloaded file
  • Install the file template

Write your first parser 

Code Conventions

  • Comply with PEP8. It should be enabled by default in Pycharm; for VS Code: https://code.visualstudio.com/docs/python/linting
    • Exceptions: you can disable PEP8 E501: line too long. PyCharm: settings > inspections > python > ignore errors. Click '+' and add E501 to the list.
      • For VS Code, using pycodestyle as the linter, you can add "python.linting.pycodestyleArgs": ["--ignore=E501"] to your settings.json file
  • Use single quotes, not double quotes
  • Class names in PascalCase: class LogServerConnectionNoVsx
  • File names in snake_case:  log_server_connection_novsx.py
    • No "parser" or ".1" in file name (awk naming convention is: fwaccel-stat-novsx.parser.1.awk – don't do this (smile))
    • Multi-step scripts are an exception to this: you will want to number your multi-step scripts. See e.g., parsers/src/checkpoint/management/cpmiquery-check-SIC-mds
  • TextFSM templates should have the same base name as the .ind.yaml and .py files, and should have the .textfsm file extension.
    • log_server_connection_novsx.ind.yaml
    • log_server_connection_novsx.py
    • log_server_connection_novsx.textfsm

0. Get a device input file

  • To write and debug the parser, we need some kind of input file.
  • The best option is to copy the input directly from the device or from the .ind document.

Copy the input to a new text file and place it under the project, for example:

input_0_0:

{
  "entries": [
    {
      "id": "cpu-1",
      "total": "2000",
      "used": "300"
    },
    {
      "id": "cpu-2",
      "total": "2000",
      "used": "1000"
    },
    {
      "id": "cpu-3",
      "total": "2000",
      "used": "1500"
    },
    {
      "id": "cpu-4",
      "total": "2000",
      "used": "10"
    }
  ]
}

1. Create a parser file

On the menu go to File → New... → IndeniParser

Give names to the file and class.

The file name cannot contain dots (.); use underscores instead (e.g. mpstat_parser.py).

The class name should be in UpperCamelCase, for example MpstatParser.
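
The new file will contain a parser class along these lines (a minimal sketch based on the full example in step 5; the exact contents generated by the template may differ):

from parser_service.public import helper_methods
from parser_service.public.base_parser import BaseParser


class MpstatParser(BaseParser):
    def parse(self, raw_data: str, dynamic_var: dict, device_tags: dict):
        # data extraction, processing and reporting go here (steps 2-4 below)
        return self.output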

2. Data Extraction

The first thing we need to do is take the raw data and convert it into a Python object.

There are several ways to do the conversion:

  • helper_methods.parse_data_as_json
    • If the data is pure JSON, simply call this method.
    • The method returns a dict object with all the values.
    • All values are returned as strings. Before use, you will have to convert them, for example float(result['entry']).
  • helper_methods.parse_data_as_xml
    • If the data is pure XML, simply call this method.
    • The method returns a dict object with all the values.
    • All values are returned as strings. Before use, you will have to convert them, for example float(result['entry']).
    • Warning: when parsing an XML list with a single element, there is no way to know it is actually a list, so the parser returns the result as a single value rather than a list.
      To avoid exceptions, always wrap list values from XML with the method to_list (see also the XML sketch below), for example,

      helper_methods.to_list(result['entries'])
  • helper_methods.parse_data_as_list
    • Using Google TextFSM templates and regex, the raw data is parsed into a list of dicts (see the TextFSM sketch after this list).
  • helper_methods.parse_data_as_object
    • Parses the raw data into a single dict.
    • This method is basically the same as parse_data_as_list, but it also validates that there is exactly one result.
  • Other:
    • The IKE can take the raw data and parse it in any other way they like.
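
To give a feel for what a TextFSM-based extraction involves, here is a minimal sketch that uses the google textfsm library directly; the template and its three-column layout are hypothetical, for illustration only (the helper_methods wrappers hide these details):

import io
import textfsm

# hypothetical template: three whitespace-separated columns (id, total, used)
TEMPLATE = r'''Value ID (\S+)
Value TOTAL (\d+)
Value USED (\d+)

Start
  ^${ID}\s+${TOTAL}\s+${USED} -> Record
'''

fsm = textfsm.TextFSM(io.StringIO(TEMPLATE))
rows = fsm.ParseText('cpu-1 2000 300\ncpu-2 2000 1000')
# one list per recorded line; fsm.header holds the value names
print(dict(zip(fsm.header, rows[0])))  # {'ID': 'cpu-1', 'TOTAL': '2000', 'USED': '300'}

Per the conventions above, such a template would live next to the parser in a file with the .textfsm extension.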

As an example, we will simply use the JSON parser:

data = helper_methods.parse_data_as_json(raw_data)
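
For comparison, if the raw data were XML the extraction would look roughly like this; this is a sketch only, and the 'entries'/'used' keys are illustrative (the exact structure depends on the XML):

result = helper_methods.parse_data_as_xml(raw_data)
# to_list guards against the single-element case described in the warning above
for entry in helper_methods.to_list(result['entries']):
    used = float(entry['used'])  # XML values also come back as strings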

3. Data Processing

After getting the data, we can process it to get the values that we need.

This section will normally be pure Python, using common methods from the helper_methods library.

In our example we want to take the data and create a map between CPU id and the percentage used:

for entry in entries:
    cpu_id = entry["id"]
    cpu_used = float(entry["used"])
    cpu_total = float(entry["total"])
    cpu_used_in_percentage = cpu_used / cpu_total * 100
    cpus[cpu_id] = cpu_used_in_percentage
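
For example, for cpu-1 this gives 300 / 2000 * 100 = 15.0, which matches the first metric in the output shown in step 5.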

4. Data Reporting

Finally, we need to report the data.

We have several types of data to report:

  • double metric - reported by write_double_metric
  • complex metric -  reported by write_complex_metric_string or write_complex_metric_object_array
  • device tag - reported by write_tag
  • dynamic variable - reported by write_dynamic_variable

By hovering over a method or clicking it, you will get information about what data to enter.

In our example, we want to report a double metric for each CPU:

for cpu_id in cpus:
    tags = {"name" : cpu_id}
	self.write_double_metric("cpu-usage",tags, "gauge", cpus[cpu_id], False)

5. Running the parser

The parser file should look like this:

from parser_service.public import helper_methods
from parser_service.public.base_parser import BaseParser


class CpuParser(BaseParser):
    def parse(self, raw_data: str, dynamic_var: dict, device_tags: dict):
        self.debug("start parser")

        # Step 1 : Data Extraction
        data = helper_methods.parse_data_as_json(raw_data)

        # Step 2 : Data Processing
        entries = data['entries']
        cpus = {}
        for entry in entries:
            cpu_id = entry["id"]
            cpu_used = float(entry["used"])
            cpu_total = float(entry["total"])
            cpu_used_in_percentage = cpu_used / cpu_total * 100
            cpus[cpu_id] = cpu_used_in_percentage

        # Step 3 : Data Reporting
        for cpu_id in cpus:
            tags = {"name" : cpu_id}
            self.write_double_metric("cpu-usage",tags, "gauge", cpus[cpu_id], False)

        return self.output

# Test your code here
# helper_methods.print_list(CpuParser().parse_file("input.json", {}, {}))

(You can see that we added a debug message, which can help the IKE find issues.)

Uncomment the last line, change the input file name to the one you created, and run the class. The result should be:

{'action_type': 'Debug', 'timestamp': 1566803012214, 'message': 'start parser'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-1'}, 'value': 15.0, 'name': 'cpu-usage', 'ds_type': 'gauge'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-2'}, 'value': 50.0, 'name': 'cpu-usage', 'ds_type': 'gauge'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-3'}, 'value': 75.0, 'name': 'cpu-usage', 'ds_type': 'gauge'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-4'}, 'value': 0.5, 'name': 'cpu-usage', 'ds_type': 'gauge'}

Using the Python parser in the Command-runner/Collector

The command-runner works with Python in the same way as it does with AWK.
See Command Runner.

The only thing you need to do in order to use the Python parser file on the Indeni platform is to link the .ind script to the parser and choose the right type:

-   run:
      type: SNMP
      command: GETBULK --columns 1.3.6.1.4.1.9.9.48.1.1.1.2 1.3.6.1.4.1.9.9.48.1.1.1.5 1.3.6.1.4.1.9.9.48.1.1.1.6
    parse:
      type: PYTHON
      file: memory_usage_parser.py

As long as the indeni-parser package is installed, the command-runner should work.

Unit testing

Files architecture

The test files for each parser should be located in a test folder, next to the parser file.

For example: 

indeni-knowledge/parsers/src/panw/panos/panos_anti_spyware_info_low_severity/test

The test file name should start with the prefix test_, for example:

indeni-knowledge/parsers/src/panw/panos/panos_anti_spyware_info_low_severity/test/test_panos_anti_spyware_info_low_severity_parser_1.py

In order for Python to recognize the package path, the path must not contain any folder or file name with "-"; it can only contain "_".
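
Putting the pieces together, the layout for this example looks roughly like this (the input file names are taken from the test file shown below; the exact set of files may vary):

indeni-knowledge/parsers/src/panw/panos/panos_anti_spyware_info_low_severity/
    panos_anti_spyware_info_low_severity_parser_1.py
    test/
        test_panos_anti_spyware_info_low_severity_parser_1.py
        valid_input.xml
        invalid_input.xml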

Running the test locally

To run the tests, PyCharm needs to know that the src folder is a source folder. We can do this by right-clicking:

indeni-knowledge/parsers/src/

and choosing Mark Directory as → Sources Root.

The folder should turn blue.
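
If you prefer the terminal, a test can also be run with the standard unittest runner; a minimal sketch, assuming indeni-parser is installed and that you run it from the src folder so the package imports resolve:

cd indeni-knowledge/parsers/src
python3 -m unittest panw.panos.panos_anti_spyware_info_low_severity.test.test_panos_anti_spyware_info_low_severity_parser_1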

How to write a test file

Example of a test file:

import os
import unittest

from panw.panos.panos_anti_spyware_info_low_severity.panos_anti_spyware_info_low_severity_parser_1 import LowSevParser1
from parser_service.public.action import WriteDynamicVariable


class TestLowSevParser1(unittest.TestCase):

    def setUp(self):
        # Arrange
        self.parser = LowSevParser1()
        self.current_dir = os.path.dirname(os.path.realpath(__file__))

    def test_valid_input(self):
        # Act
        result = self.parser.parse_file(self.current_dir + '/valid_input.xml', {}, {})

        # Assert
        self.assertEqual(2, len(result))

        self.assertTrue(isinstance(result[0], WriteDynamicVariable))
        self.assertEqual('profile', result[0].key)
        self.assertEqual('Test', result[0].value)

        self.assertTrue(isinstance(result[1], WriteDynamicVariable))
        self.assertEqual('profile', result[1].key)
        self.assertEqual('Test-1', result[1].value)

    def test_invalid_input(self):
        # Act
        result = self.parser.parse_file(self.current_dir + '/invalid_input.xml', {}, {})
        # Assert
        self.assertEqual(0, len(result))


if __name__ == '__main__':
    unittest.main()

  • A test file must start with the prefix test_. If the prefix is missing, the CI will not run it.
  • A test class must start with the prefix Test, for example TestLowSevParser1.
  • A test class must inherit from unittest.TestCase.
  • Every test file needs to use an input file that sits next to the test file. To get the relative path, use:
self.current_dir = os.path.dirname(os.path.realpath(__file__))
  • Every test method must start with the prefix test_.
  • Use the parse_file method to run the parser and assert the result.
  • It's better to first check that the number of result items is correct, and then check each one of them.
  • After running the CI, make sure the tests ran: