intelmq.bots.parsers.generic package

Submodules

intelmq.bots.parsers.generic.parser_csv module

Generic CSV parser

Parameters:

columns: string
delimiter: string
default_url_protocol: string
skip_header: boolean
type: string
type_translation: string
data_type: string

intelmq.bots.parsers.generic.parser_csv.BOT

alias of GenericCsvParserBot

class intelmq.bots.parsers.generic.parser_csv.GenericCsvParserBot(*args, **kwargs)

Bases: ParserBot

Parse generic CSV data. Lines starting with the # character are ignored. URLs without a protocol can be prefixed with a default value.
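The behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the bot's implementation: `iter_rows` and `with_protocol` are hypothetical helper names, and the check for `://` as the sign of an existing protocol is an assumption.

```python
import csv

DEFAULT_URL_PROTOCOL = "http://"  # mirrors the documented default value


def iter_rows(raw: str, delimiter: str = ","):
    """Yield CSV rows, skipping lines that start with '#'."""
    lines = (line for line in raw.splitlines() if not line.startswith("#"))
    yield from csv.reader(lines, delimiter=delimiter)


def with_protocol(url: str, default: str = DEFAULT_URL_PROTOCOL) -> str:
    """Prefix the default protocol when the URL has none (assumed check: '://')."""
    return url if "://" in url else default + url


raw = "# comment\nexample.com/a,1.2.3.4\nhttps://example.org/b,5.6.7.8\n"
rows = [[with_protocol(row[0]), row[1]] for row in iter_rows(raw)]
```

Here the comment line is dropped, the first URL gains the default prefix, and the second URL is left unchanged because it already carries a protocol.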

columns: str | Iterable = None
columns_required: dict | None = None
compose_fields: dict | None = {}
data_type: dict | None = None
default_url_protocol: str = 'http://'
delimiter: str = ','
filter_text = None
filter_type = None
init()
parse(report)

A generator yielding the single elements of the data.

Comments, headers etc. can be processed here. Data needed by self.parse_line can be saved in self.tempdata (list).

Default parser yields stripped lines. Override for your use or use an existing parser, e.g.:

parse = ParserBot.parse_csv

You should do that for recovering lines too.

recover_line = ParserBot.recover_line_csv
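As a rough illustration of the generator pattern described for parse, the following stand-in yields stripped lines and stores comment lines in tempdata. `SketchParser` is a hypothetical class, not IntelMQ's ParserBot; taking the raw text as a plain string and skipping blank lines are assumptions made to keep the sketch self-contained.

```python
class SketchParser:
    """Hypothetical stand-in for the parse() pattern; not IntelMQ's ParserBot."""

    def __init__(self):
        self.tempdata = []  # data kept for later use by parse_line

    def parse(self, raw: str):
        """Yield stripped data lines; keep comment lines in tempdata."""
        for line in raw.splitlines():
            line = line.strip()
            if not line:
                continue  # assumption: blank lines carry no data
            if line.startswith("#"):
                self.tempdata.append(line)  # comments/headers saved for later
                continue
            yield line


parser = SketchParser()
elements = list(parser.parse("# feed v1\n 1.2.3.4,malware \n\n5.6.7.8,phishing\n"))
```

Because parse is a generator, tempdata is only filled while the caller iterates over the result.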

parse_line(row: list, report)

A generator which can yield one or more messages contained in the line.

Report has the full message, thus you can access some metadata. Override for your use.
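A simplified sketch of such a generator is shown below. Plain dicts stand in for IntelMQ's message objects, and the `COLUMNS` mapping and the `feed.name` metadata key are assumptions chosen for illustration only.

```python
COLUMNS = ["source.ip", "classification.type"]  # hypothetical column mapping


def parse_line(row: list, report: dict):
    """Yield one event dict per row, copying metadata from the report."""
    event = {"feed.name": report.get("feed.name")}
    for key, value in zip(COLUMNS, row):
        if value:  # skip empty cells
            event[key] = value
    yield event


report = {"feed.name": "example-feed"}
events = list(parse_line(["1.2.3.4", "malware"], report))
```

Using yield rather than return lets an override emit several messages from a single row when a line contains more than one.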

recover_line(line: list | None = None) -> str

Recover the CSV line, respecting the saved line ending.

Parameter:

line: Optional line as a list. If absent, the current line is used as a string.
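One way to turn a row back into CSV text while controlling the line ending is the standard csv.writer, sketched here. `recover_csv_line` is a hypothetical helper, and passing the line ending as an argument is an assumption; the bot itself keeps the line ending as saved state.

```python
import csv
import io


def recover_csv_line(row: list, delimiter: str = ",", line_ending: str = "\r\n") -> str:
    """Serialize a row back to one CSV line using the given line ending."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator=line_ending)
    writer.writerow(row)
    return out.getvalue()


recovered = recover_csv_line(["1.2.3.4", "a,b"], line_ending="\n")
```

The writer quotes fields that contain the delimiter, so the recovered line parses back to the same row.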

skip_header: bool | int = False
time_format: TimeFormat | None = None
type: str | None = None
type_translation = {}

Module contents