Python Parser Tutorial
Introduction
Using Python as the parser language, the IKE gains the ability to:
- Write parser scripts in a well-documented, popular language
- Easily debug the scripts
- Write unit tests
Prepare your environment
1. Install python 3.7
Use any tutorial that matches your OS, for example https://realpython.com/installing-python/
Make sure the installation succeeded by running in the terminal:
python3 --version
The result should be similar to:
Python 3.7.4
2. Install indeni-parser
Using the terminal, install the latest version of indeni-parser:
pip3 install "indeni_parser==0.0.0.*" --extra-index-url https://indeni.jfrog.io/indeni/api/pypi/indeni-pypi-develop/simple
The result should be similar to:
Looking in indexes: https://service.indeni-ops.com/repo/api/pypi/indeni-pypi/simple
Collecting indeni-parser
  Downloading https://service.indeni-ops.com/repo/api/pypi/indeni-pypi/indeni-parser/0.0.0.1667/indeni_parser-0.0.0.1667-py3-none-any.whl
Installing collected packages: indeni-parser
Successfully installed indeni-parser-0.0.0.1667
To uninstall indeni-parser, use:
pip3 uninstall indeni-parser
3. Install PyCharm
Install the community edition of PyCharm
https://www.jetbrains.com/pycharm/download
4. Define Project Interpreter
- Open the relevant project in PyCharm (for example indeni-knowledge)
- On the top menu go to PyCharm → Preferences...
- On the left pane go to Project → Project Interpreter
The Project Interpreter should be:
/usr/local/bin/python3.7
Continue to the next step if the Project Interpreter is not as mentioned above, or if it is one of:
/usr/local/bin/python2 - using python2
/usr/local/bin/python - using the default python
/document/venv/bin/python - using a virtual env
- Open the Project Interpreter window and create a new one.
- While creating the new Project Interpreter, choose Existing environment rather than a new one.
Link it to the python binary, for example:
/usr/local/bin/python3.7
5. Add indeni-parser template
- Download the file indeni_parser_template.zip
- On the PyCharm top menu go to File → Import settings and choose the downloaded file
- Install the file template
Write your first parser
Code Conventions
- Comply with PEP8. It should be enabled by default in PyCharm; for VS Code: https://code.visualstudio.com/docs/python/linting
- Exception: you can disable PEP8 E501 (line too long). In PyCharm: Settings → Inspections → Python → Ignore errors; click '+' and add E501 to the list.
- For VS Code, using pycodestyle as the linter, you can add "python.linting.pycodestyleArgs": ["--ignore=E501"] to your settings.json file
- Use single quotes, not double quotes
- Class names in PascalCase: class LogServerConnectionNoVsx
- File names in snake_case: log_server_connection_novsx.py
- No "parser" or ".1" in the file name (the awk naming convention is fwaccel-stat-novsx.parser.1.awk - don't do this)
- Multi-step scripts are an exception to this: you will want to number your multi-step scripts. See e.g., parsers/src/checkpoint/management/cpmiquery-check-SIC-mds
- TextFSM templates should have the same base name as the .ind.yaml and .py files, and should have the .textfsm file extension.
- log_server_connection_novsx.ind.yaml
- log_server_connection_novsx.py
- log_server_connection_novsx.textfsm
0. Get a device input file
- To write the parser while debugging, we need some kind of input file.
- The best option is to copy the input directly from the device or from the ind document.
Copy the input to a new text file and place it under the project, for example:
input_0_0:
{
  "entries": [
    { "id": "cpu-1", "total": "2000", "used": "300" },
    { "id": "cpu-2", "total": "2000", "used": "1000" },
    { "id": "cpu-3", "total": "2000", "used": "1500" },
    { "id": "cpu-4", "total": "2000", "used": "10" }
  ]
}
1. Create a parser file
On the top menu go to File → New.. → IndeniParser
Give names to the file and class.
The file name cannot contain dots (.); use underscores instead (e.g. mpstat_parser.py).
The class name should be UpperCamelCase, for example MpstatParser.
2. Data Extraction
The first thing we need to do is take the raw data and convert it into a Python object.
There are several ways to do the conversion:
- helper_methods.parse_data_as_json:
- If the data is pure JSON, one can simply call this method.
- The method returns a dict object with all the values.
- All values are returned as strings; convert them before use, for example float(result['entry']).
- helper_methods.parse_data_as_xml:
- If the data is pure XML, one can simply call this method.
- The method returns a dict object with all the values.
- All values are returned as strings; convert them before use, for example float(result['entry']).
Warning: when parsing an XML list that contains a single value, there is no way to know it is actually a list, so the parser returns the result as a single value rather than a list.
To avoid exceptions, always wrap list values from XML with the method to_list, for example helper_methods.to_list(result['entries']).
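The warning above can be made concrete. The function below is only an illustrative stand-in that mimics the behavior described for helper_methods.to_list (the real helper lives in the indeni_parser package):

```python
# Illustrative stand-in for helper_methods.to_list: XML-to-dict converters
# return a single dict when a repeated element occurs only once, so code
# that expects a list must normalize the value first.
def to_list(value):
    """Wrap a scalar/dict in a list; pass real lists through unchanged."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# Parsed XML with one <entry> element: the value is a dict, not a list.
one_entry = {'entries': {'entry': {'id': 'cpu-1'}}}
# Parsed XML with two <entry> elements: the value is already a list.
many_entries = {'entries': {'entry': [{'id': 'cpu-1'}, {'id': 'cpu-2'}]}}

assert to_list(one_entry['entries']['entry']) == [{'id': 'cpu-1'}]
assert len(to_list(many_entries['entries']['entry'])) == 2
```

With this normalization, the same iteration code handles both the single-element and multi-element cases.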
- helper_methods.parse_data_as_list:
- Using Google TextFSM and regex, we can parse raw data into a table.
- Review the documentation here: https://github.com/google/textfsm/wiki/TextFSM.
- The result is a list of dicts, where each line is a map between column name and value.
- helper_methods.parse_data_as_object:
- Using Google TextFSM and regex, we can parse raw data into a single dict.
- This method is basically the same as parse_data_as_list, but it also validates that exactly one result was returned.
- Other:
- The IKE can take the raw data and parse it however they like.
As an example, we will simply use the json parser:
data = helper_methods.parse_data_as_json(raw_data)
3. Data Processing
After getting the data, we can process it to get the values that we need.
This section will normally be pure Python, using the common methods available in the helper_methods library.
In our example we would like to take the data and create a map between cpu-id and percentage used:
for entry in entries:
    cpu_id = entry["id"]
    cpu_used = float(entry["used"])
    cpu_total = float(entry["total"])
    cpu_used_in_percentage = cpu_used / cpu_total * 100
    cpus[cpu_id] = cpu_used_in_percentage
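The extraction and processing steps can be exercised end to end with the standard library alone. In this sketch, json.loads stands in for helper_methods.parse_data_as_json, and the sample input is taken from step 0:

```python
import json

# Two entries from the step-0 sample input; all values arrive as strings.
raw_data = ('{"entries": [{"id": "cpu-1", "total": "2000", "used": "300"},'
            ' {"id": "cpu-2", "total": "2000", "used": "1000"}]}')

data = json.loads(raw_data)  # stand-in for helper_methods.parse_data_as_json
cpus = {}
for entry in data['entries']:
    cpu_used = float(entry['used'])      # convert strings before arithmetic
    cpu_total = float(entry['total'])
    cpus[entry['id']] = cpu_used / cpu_total * 100

print(cpus)  # {'cpu-1': 15.0, 'cpu-2': 50.0}
```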
4. Data Reporting
Finally, we need to report the data.
We have several types of data to report:
- double metric - reported by write_double_metric
- complex metric - reported by write_complex_metric_string or write_complex_metric_object_array
- device tag - reported by write_tag
- dynamic variable - reported by write_dynamic_variable
Hover over a method, or click it, to see what data it expects.
In our example, we want to report double metric for each cpu
for cpu_id in cpus:
    tags = {"name": cpu_id}
    self.write_double_metric("cpu-usage", tags, "gauge", cpus[cpu_id], False)
5. Running the parser
The parser file should look like this:
from parser_service.public import helper_methods
from parser_service.public.base_parser import BaseParser


class CpuParser(BaseParser):

    def parse(self, raw_data: str, dynamic_var: dict, device_tags: dict):
        self.debug("start parser")

        # Step 1 : Data Extraction
        data = helper_methods.parse_data_as_json(raw_data)

        # Step 2 : Data Processing
        entries = data['entries']
        cpus = {}
        for entry in entries:
            cpu_id = entry["id"]
            cpu_used = float(entry["used"])
            cpu_total = float(entry["total"])
            cpu_used_in_percentage = cpu_used / cpu_total * 100
            cpus[cpu_id] = cpu_used_in_percentage

        # Step 3 : Data Reporting
        for cpu_id in cpus:
            tags = {"name": cpu_id}
            self.write_double_metric("cpu-usage", tags, "gauge", cpus[cpu_id], False)

        return self.output

# Test your code here
# helper_methods.print_list(CpuParser().parse_file("input.json", {}, {}))
(Note that we added a debug message, which can help the IKE find issues.)
Uncomment the last line, change the input name to the file you created, and run the class. The result should be:
{'action_type': 'Debug', 'timestamp': 1566803012214, 'message': 'start parser'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-1'}, 'value': 15.0, 'name': 'cpu-usage', 'ds_type': 'gauge'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-2'}, 'value': 50.0, 'name': 'cpu-usage', 'ds_type': 'gauge'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-3'}, 'value': 75.0, 'name': 'cpu-usage', 'ds_type': 'gauge'}
{'action_type': 'WriteDoubleMetric', 'timestamp': 1566803012214, 'tags': {'name': 'cpu-4'}, 'value': 0.5, 'name': 'cpu-usage', 'ds_type': 'gauge'}
Using the python parser in the Command-runner/Collector
The command-runner works with Python in the same way as with AWK.
See Command Runner.
The only thing needed to use the Python parser file on the Indeni platform is to link the ind script to the parser and choose the right type:
- run:
    type: SNMP
    command: GETBULK --columns 1.3.6.1.4.1.9.9.48.1.1.1.2 1.3.6.1.4.1.9.9.48.1.1.1.5 1.3.6.1.4.1.9.9.48.1.1.1.6
  parse:
    type: PYTHON
    file: memory_usage_parser.py
As long as the indeni-parser package is installed, the command-runner should work.
Unit testing
Files architecture
The test files for each parser should be located in a test folder, next to the parser file.
For example:
indeni-knowledge/parsers/src/panw/panos/panos_anti_spyware_info_low_severity/test
The test file name should start with the prefix test_, for example:
indeni-knowledge/parsers/src/panw/panos/panos_anti_spyware_info_low_severity/test/test_panos_anti_spyware_info_low_severity_parser_1.py
For Python to recognize the package path, the path must not contain any folder or file name with "-"; it may only contain "_".
Running the test locally
While running the tests, PyCharm needs to know that the src folder is a source folder. We can set this by right-clicking:
indeni-knowledge/parsers/src/
and choosing Mark Directory as → Sources Root.
The folder color should turn blue.
How to write a test file
Example of a test file:
import os
import unittest

from panw.panos.panos_anti_spyware_info_low_severity.panos_anti_spyware_info_low_severity_parser_1 import LowSevParser1
from parser_service.public.action import WriteDynamicVariable


class TestLowSevParser1(unittest.TestCase):

    def setUp(self):
        # Arrange
        self.parser = LowSevParser1()
        self.current_dir = os.path.dirname(os.path.realpath(__file__))

    def test_valid_input(self):
        # Act
        result = self.parser.parse_file(self.current_dir + '/valid_input.xml', {}, {})

        # Assert
        self.assertEqual(2, len(result))

        self.assertTrue(isinstance(result[0], WriteDynamicVariable))
        self.assertEqual('profile', result[0].key)
        self.assertEqual('Test', result[0].value)

        self.assertTrue(isinstance(result[1], WriteDynamicVariable))
        self.assertEqual('profile', result[1].key)
        self.assertEqual('Test-1', result[1].value)

    def test_invalid_input(self):
        # Act
        result = self.parser.parse_file(self.current_dir + '/invalid_input.xml', {}, {})

        # Assert
        self.assertEqual(0, len(result))


if __name__ == '__main__':
    unittest.main()
- A test file name must start with the prefix test_. If the prefix is missing, the CI will not run it.
- A test class name must start with the prefix Test, for example TestLowSevParser1.
- A test class must inherit from unittest.TestCase.
- Every test needs an input file that sits next to the test file. To get the relative path, use:
self.current_dir = os.path.dirname(os.path.realpath(__file__))
- Every test method must start with the prefix test_.
- Use the parse_file method to run the parser and assert the result.
- It's better to first check that the number of result items is correct, and then check each item.
- After the CI runs, make sure the tests actually ran.