intelmq.bots.parsers.generic package

Submodules

intelmq.bots.parsers.generic.parser_csv module

Generic CSV parser

Parameters:
    columns: string
    delimiter: string
    default_url_protocol: string
    skip_header: boolean
    type: string
    type_translation: string
    data_type: string

intelmq.bots.parsers.generic.parser_csv.BOT: alias of GenericCsvParserBot

class intelmq.bots.parsers.generic.parser_csv.GenericCsvParserBot(*args, **kwargs)

Bases: ParserBot

Parse generic CSV data. Lines starting with the character # are ignored. URLs without a protocol can be prefixed with a default value.
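The behaviour described above can be illustrated with a minimal standalone sketch. It uses only the standard library and is not IntelMQ's implementation: the function name `read_rows`, the `url_column` argument, and the "no `://` means no protocol" check are assumptions made for illustration.

```python
import csv


def read_rows(text, delimiter=",", default_url_protocol="http://", url_column=0):
    """Yield CSV rows, skipping '#' comment lines and prefixing bare URLs.

    Hypothetical helper for illustration; not part of IntelMQ.
    """
    # Drop comment lines before handing the data to the csv module.
    data_lines = [line for line in text.splitlines() if not line.startswith("#")]
    for row in csv.reader(data_lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        url = row[url_column]
        # Assumption: a value without "://" carries no protocol of its own.
        if "://" not in url:
            row[url_column] = default_url_protocol + url
        yield row


sample = "# feed header\nexample.com/a,malware\nhttps://example.org/b,phishing\n"
rows = list(read_rows(sample))
```

Here the comment line is dropped, `example.com/a` gains the default prefix, and the URL that already names a protocol is left unchanged.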

column_regex_search: dict | None = None

columns: str | Iterable = None

columns_required: dict | None = None

compose_fields: dict | None = {}

data_type: dict | None = None

default_url_protocol: str = 'http://'

delimiter: str = ','

filter_text = None

filter_type = None

init()

parse(report)

A generator yielding the single elements of the data.

Comments, headers etc. can be processed here. Data needed by self.parse_line can be saved in self.tempdata (list).

Default parser yields stripped lines. Override for your use or use an existing parser, e.g.:

parse = ParserBot.parse_csv

You should do the same for recovering lines:

recover_line = ParserBot.recover_line_csv
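The generator pattern described for parse can be sketched without IntelMQ installed. The class below is a hypothetical stand-in, not ParserBot itself: `MiniCsvParser` and its constructor are invented for illustration, and treating an integer `skip_header` as a number of lines to skip is an assumption based on the `bool | int` annotation.

```python
import csv


class MiniCsvParser:
    """Hypothetical stand-in that imitates the parse/tempdata pattern."""

    def __init__(self, skip_header=False):
        self.skip_header = skip_header
        self.tempdata = []  # lines set aside for later use, e.g. headers

    def parse(self, raw_text):
        """Yield one CSV row (a list of strings) per data line."""
        lines = raw_text.splitlines()
        # Assumption: True means one header line, an int means that many lines.
        header_count = int(self.skip_header)
        # Keep the skipped lines so later steps can still see them.
        self.tempdata.extend(lines[:header_count])
        for row in csv.reader(lines[header_count:]):
            if row:  # ignore empty lines
                yield row


parser = MiniCsvParser(skip_header=True)
rows = list(parser.parse("ip,type\n192.0.2.1,scanner\n"))
```

After the call, the header line sits in `parser.tempdata` and only the data row has been yielded.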

parse_line(row: list, report)

A generator which can yield one or more messages contained in line.

Report has the full message, thus you can access some metadata. Override for your use.
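As a rough illustration of a line parser written as a generator, the sketch below pairs each value in a row with a column name and yields a plain dict. It is not the bot's actual logic: the function name `row_to_events`, the use of dicts instead of IntelMQ event objects, and the rule that empty column names or empty values are skipped are assumptions. The field names are only example values.

```python
def row_to_events(row, columns):
    """Yield one dict per row, pairing values with column names.

    Hypothetical helper for illustration; not part of IntelMQ.
    """
    # The columns attribute is annotated as str | Iterable, so accept both.
    if isinstance(columns, str):
        columns = [name.strip() for name in columns.split(",")]
    event = {}
    for name, value in zip(columns, row):
        # Assumption: unnamed columns and empty values are skipped.
        if not name or not value:
            continue
        event[name] = value
    if event:
        yield event


events = list(
    row_to_events(
        ["192.0.2.1", "ignored", "scanner"],
        "source.ip,,classification.type",
    )
)
```

Writing the function as a generator leaves room for a row to produce zero, one, or several messages, which matches the description above.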

recover_line(line: list | None = None) → str

Recover csv line, respecting saved line ending.

Parameter: line: Optional line as a list. If absent, the current line is used as a string.
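Turning a parsed row back into a CSV line while keeping a given line ending could look like the following sketch. It relies only on the standard csv module and does not show how IntelMQ stores the line ending or the current line: `recover_csv_line` and its `line_ending` argument are hypothetical.

```python
import csv
import io


def recover_csv_line(row, delimiter=",", line_ending="\r\n"):
    """Serialize a row (list of strings) back into one CSV line.

    Hypothetical helper for illustration; not part of IntelMQ.
    """
    out = io.StringIO()
    # lineterminator controls what csv.writer appends after each row.
    writer = csv.writer(out, delimiter=delimiter, lineterminator=line_ending)
    writer.writerow(row)
    return out.getvalue()


recovered = recover_csv_line(["a,b", "c"])
```

Going through csv.writer rather than joining with the delimiter means fields that contain the delimiter are quoted again, so the recovered line parses back to the same row.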

skip_header: bool | int = False

time_format: TimeFormat | None = None

type: str | None = None

type_translation = {}

Module contents