intelmq.bots.parsers.generic package
Submodules
intelmq.bots.parsers.generic.parser_csv module
Generic CSV parser
Parameters:
- columns: string
- delimiter: string
- default_url_protocol: string
- skip_header: boolean
- type: string
- type_translation: string
- data_type: string
- intelmq.bots.parsers.generic.parser_csv.BOT
alias of GenericCsvParserBot
- class intelmq.bots.parsers.generic.parser_csv.GenericCsvParserBot(*args, **kwargs)
Bases: ParserBot
Parse generic CSV data. Lines starting with the character # are ignored. URLs without a protocol can be prefixed with a default value.
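The behaviour described above (skipping comment lines and prefixing URLs that lack a protocol) can be sketched with the standard library alone. This is a minimal illustration, not the bot's actual implementation: `iter_rows` and `with_protocol` are hypothetical helper names, and the assumption that a URL "has a protocol" when it contains `://` is ours.

```python
import csv

DEFAULT_URL_PROTOCOL = "http://"  # mirrors the documented default value


def iter_rows(raw: str, delimiter: str = ","):
    """Yield CSV rows, skipping lines that start with '#' (hypothetical helper)."""
    lines = (line for line in raw.splitlines() if not line.startswith("#"))
    yield from csv.reader(lines, delimiter=delimiter)


def with_protocol(url: str, default: str = DEFAULT_URL_PROTOCOL) -> str:
    """Prefix url with default when it carries no scheme (assumed check: '://')."""
    return url if "://" in url else default + url


raw = "# comment\nexample.com/a,1.2.3.4\nhttps://example.org/b,5.6.7.8\n"
rows = [[with_protocol(row[0]), row[1]] for row in iter_rows(raw)]
```

Here the comment line is dropped, the first URL gains the default prefix, and the second URL is left unchanged because it already names a protocol.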
- column_regex_search: dict | None = None
- columns: str | Iterable = None
- columns_required: dict | None = None
- compose_fields: dict | None = {}
- data_type: dict | None = None
- default_url_protocol: str = 'http://'
- delimiter: str = ','
- filter_text = None
- filter_type = None
- init()
- parse(report)
A generator yielding the individual elements of the data.
Comments, headers, etc. can be processed here. Data needed by self.parse_line can be saved in self.tempdata (a list).
The default parser yields stripped lines. Override it for your use case, or use an existing parser, e.g.:
parse = ParserBot.parse_csv
You should do the same for recovering lines:
recover_line = ParserBot.recover_line_csv
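The override pattern mentioned above relies on assigning an existing method as a class attribute. The sketch below shows only that mechanism and is not IntelMQ code: `BaseParser` is a hypothetical stand-in for ParserBot, its methods take a plain string instead of a report, and `MyCsvParser` is an invented subclass name.

```python
import csv
import io


class BaseParser:
    """Hypothetical stand-in for ParserBot; not the IntelMQ implementation."""

    def parse(self, raw: str):
        # Default behaviour as documented: yield stripped lines.
        for line in raw.splitlines():
            yield line.strip()

    def parse_csv(self, raw: str):
        # Alternative parser: yield each CSV row as a list of fields.
        yield from csv.reader(io.StringIO(raw))


class MyCsvParser(BaseParser):
    # Reuse an existing parser by assigning it as a class attribute.
    parse = BaseParser.parse_csv


raw = "a,b\n1,2\n"
default_rows = list(BaseParser().parse(raw))
csv_rows = list(MyCsvParser().parse(raw))
```

With the default parser each element is a stripped line of text, while the subclass yields lists of fields. The same assignment style applies to recover_line.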
- parse_line(row: list, report)
A generator which can yield one or more messages contained in the line.
The report holds the full message, so you can access its metadata. Override this method for your use case.
- recover_line(line: list | None = None) → str
Recover the CSV line, respecting the saved line ending.
- Parameter:
line: Optional line as a list. If absent, the current line is used as a string.
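To illustrate what recovering a CSV line from a list of fields can look like, here is a minimal sketch using Python's csv module. It is an assumption-laden illustration rather than the method's real body: `recover_csv_line` is a hypothetical function, and the line ending is passed in explicitly instead of being read from saved parser state.

```python
import csv
import io


def recover_csv_line(row, delimiter=",", line_ending="\r\n"):
    """Serialize a list of fields back into one CSV line (hypothetical helper)."""
    out = io.StringIO()
    # lineterminator controls which line ending the writer appends.
    writer = csv.writer(out, delimiter=delimiter, lineterminator=line_ending)
    writer.writerow(row)
    return out.getvalue()


recovered = recover_csv_line(["example.com", "a,b"], line_ending="\n")
```

Fields containing the delimiter are quoted by the writer, so the recovered text can be parsed again into the same fields.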
- skip_header: bool | int = False
- time_format: TimeFormat | None = None
- type: str | None = None
- type_translation = {}