flexget.plugins.input.regexp_parse module

class flexget.plugins.input.regexp_parse.RegexpParse

Bases: object

Designed to take input from a web resource or a file.

It then parses the text via regexps supplied in the config file.

source: a file or URL to get the data from. You can include username:password credentials in the URL (see the example config).

sections: takes a list of dicts that contain regexps used to split the data into sections. The regexps listed here are applied with findall, so every matching string in the data becomes a section.

keys: holds the keys that will be set in the entries.

key: the name of an entry field (for example title or url), configured with the options below.

regexps: a list of dicts that hold regexps. The key is set to the first string that matches any of the listed regexps. The regexps are evaluated in the order they are supplied, so once one regexp matches, none of the later ones in the list are used.

required: a boolean that, when set to true, only allows entries that contain this key to pass on to the next stage. url and title are always required regardless of this setting (they are required by FlexGet).
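
The first-match rule described for regexps can be sketched roughly as follows. This is only an illustration under assumptions, not the plugin's code: the helper name first_match and the sample text are hypothetical, and it assumes each pattern is applied with search and the whole match is kept.

```python
import re


def first_match(regexps, text):
    """Return the first string matched by any pattern, trying patterns in order.

    Hypothetical helper for illustration; not part of FlexGet.
    """
    for pattern in regexps:
        match = pattern.search(text)
        if match:
            # Stop at the first pattern that matches; later ones are ignored.
            return match.group(0)
    return None  # no pattern matched, so the key would stay unset


# Made-up section text and patterns.
section = "quality=720p size=1.4GB"
regexps = [re.compile(r"\d+p"), re.compile(r"\d+(\.\d+)?GB")]
print(first_match(regexps, section))
```

Because the list order decides which pattern wins, putting the most specific regexp first and a looser fallback after it is the natural way to use this option.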

#TODO: consider adding a set field that will allow you to set the field if no regexps match

#TODO: consider a mode field that allows a growing list for a field instead of just setting it to the first match

Example config

regexp_parse:
  source: http://username:password@ezrss.it/feed/
  encoding: "utf-8"
  sections:
    - {regexp: "(?<=<item>).*?(?=</item>)", flags: "DOTALL,IGNORECASE"}

  keys:
    title:
      regexps:
        - {regexp: '(?<=<title><!\[CDATA\[).*?(?=\]\]></title>)'} #comment
    url:
      regexps:
        - {regexp: "magnet:.*?(?=])"}
    custom_field:
      regexps:
        - {regexp: "custom regexps", flags: "comma separated list of flags (see python regex docs)"}
      required: False
    custom_field2:
      regexps:
        - {regexp: 'first custom regexps'}
        - {regexp: "can't find first regexp so try this one"}
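
To see what the sections, title and url regexps from the example config do, the sketch below applies them to a small made-up feed using only the standard re module. The feed text, the magnetURI element and the variable names are assumptions for illustration; the plugin itself fetches the data from source and builds FlexGet entries rather than plain dicts.

```python
import re

# Made-up sample data standing in for the downloaded feed.
FEED = """<rss><channel>
<item>
<title><![CDATA[Show A S01E01]]></title>
<magnetURI><![CDATA[magnet:?xt=urn:btih:aaa]]></magnetURI>
</item>
<item>
<title><![CDATA[Show B S02E03]]></title>
<magnetURI><![CDATA[magnet:?xt=urn:btih:bbb]]></magnetURI>
</item>
</channel></rss>"""

# Same patterns as in the example config.
SECTION_RE = re.compile(r"(?<=<item>).*?(?=</item>)", re.DOTALL | re.IGNORECASE)
KEY_RES = {
    "title": re.compile(r"(?<=<title><!\[CDATA\[).*?(?=\]\]></title>)"),
    "url": re.compile(r"magnet:.*?(?=])"),
}

entries = []
# findall returns every non-overlapping match, so each <item> body is a section.
for section in SECTION_RE.findall(FEED):
    entry = {}
    for key, pattern in KEY_RES.items():
        match = pattern.search(section)
        if match:
            entry[key] = match.group(0)
    entries.append(entry)

for entry in entries:
    print(entry)
```

The DOTALL flag matters here: without it, `.` would not cross the newlines inside each item and no section would be found.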

compile_regexp_dict_list(re_list): Turn a list of dicts containing regexps information into a list of compiled regexps.
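
A rough idea of what such a conversion could look like is sketched below. The function name compile_dicts is a hypothetical stand-in, and resolving flag names with getattr(re, name) is an assumption of this sketch rather than the plugin's documented approach.

```python
import re
from functools import reduce


def compile_dicts(re_list):
    """Compile [{'regexp': ..., 'flags': ...}, ...] into pattern objects.

    Hypothetical stand-in for illustration; not FlexGet's implementation.
    """
    compiled = []
    for item in re_list:
        # "flags" is optional; an absent or empty value means no extra flags.
        names = [n.strip() for n in item.get("flags", "").split(",") if n.strip()]
        flags = reduce(lambda acc, name: acc | getattr(re, name), names, 0)
        compiled.append(re.compile(item["regexp"], flags))
    return compiled


patterns = compile_dicts(
    [
        {"regexp": "a.b", "flags": "DOTALL,IGNORECASE"},
        {"regexp": "x"},
    ]
)
print(patterns)
```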

flagstr_to_flags(flag_str): Turn a comma separated list of flags into the int value.
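
One plausible way to turn such a string into a flag value is to look each name up in a table like FLAG_VALUES and OR the results together. The sketch below assumes exactly that; the function name flags_from_string is hypothetical and the real method may differ in details such as whitespace handling.

```python
import re

# Same mapping as the FLAG_VALUES attribute documented on this page.
FLAG_VALUES = {
    "DEBUG": re.DEBUG, "DOTALL": re.DOTALL,
    "I": re.IGNORECASE, "IGNORECASE": re.IGNORECASE,
    "L": re.LOCALE, "LOCALE": re.LOCALE,
    "M": re.MULTILINE, "MULTILINE": re.MULTILINE,
    "S": re.DOTALL, "U": re.UNICODE, "UNICODE": re.UNICODE,
    "VERBOSE": re.VERBOSE, "X": re.VERBOSE,
}


def flags_from_string(flag_str):
    """Combine a comma separated list of flag names into one flag value."""
    flags = 0
    for name in flag_str.split(","):
        # Flags are bit masks, so combining them is a bitwise OR.
        flags |= FLAG_VALUES[name.strip()]
    return flags


combined = flags_from_string("DOTALL,IGNORECASE")
print(combined == (re.DOTALL | re.IGNORECASE))
```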

isvalid(entry): Check to make sure that all required fields are present in the entry.
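
The check can be pictured as below. This is a sketch under assumptions: the documented method takes only the entry, so it must obtain the required keys from elsewhere, whereas the hypothetical is_valid helper here takes them as an argument, and the keys_config dict is made up.

```python
def is_valid(entry, required_keys):
    """Return True when every required key is present in the entry.

    Hypothetical stand-in for illustration; not FlexGet's implementation.
    """
    return all(key in entry for key in required_keys)


# Made-up keys section of a config: only "quality" opts in explicitly.
keys_config = {
    "title": {},
    "url": {},
    "custom_field": {"required": False},
    "quality": {"required": True},
}

# title and url are always required; other keys count only if required is true.
required = {"title", "url"} | {
    key for key, options in keys_config.items() if options.get("required")
}

print(is_valid({"title": "a", "url": "b", "quality": "720p"}, required))
print(is_valid({"title": "a", "url": "b"}, required))
```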

on_task_input(**kwargs)

FLAG_REGEX = '^(\\s?(DEBUG|I|IGNORECASE|L|LOCALE|M|MULTILINE|S|DOTALL|U|UNICODE|X|VERBOSE)\\s?(,|$))+$'
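
The same pattern appears in the schema as the validation rule for the flags option. A quick way to see which strings it accepts is to try it directly with the re module; the candidate strings below are made up for illustration.

```python
import re

# Same pattern as FLAG_REGEX above, written as a raw string.
FLAG_REGEX = (
    r"^(\s?(DEBUG|I|IGNORECASE|L|LOCALE|M|MULTILINE|S|DOTALL|U|UNICODE|X|VERBOSE)"
    r"\s?(,|$))+$"
)

for candidate in ("DOTALL,IGNORECASE", "I, M", "dotall", "DOTALL;IGNORECASE"):
    ok = re.match(FLAG_REGEX, candidate) is not None
    print(f"{candidate!r}: {'valid' if ok else 'invalid'}")
```

Flag names are case sensitive and must be separated by commas, with at most one whitespace character on either side of each name.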

FLAG_VALUES = {'DEBUG': re.DEBUG, 'DOTALL': re.DOTALL, 'I': re.IGNORECASE, 'IGNORECASE': re.IGNORECASE, 'L': re.LOCALE, 'LOCALE': re.LOCALE, 'M': re.MULTILINE, 'MULTILINE': re.MULTILINE, 'S': re.DOTALL, 'U': re.UNICODE, 'UNICODE': re.UNICODE, 'VERBOSE': re.VERBOSE, 'X': re.VERBOSE}

schema = {'$defs': {'regex_list': {'items': {'additionalProperties': False, 'properties': {'flags': {'error_pattern': 'Must be a comma separated list of flags. See python regex docs.', 'pattern': '^(\\s?(DEBUG|I|IGNORECASE|L|LOCALE|M|MULTILINE|S|DOTALL|U|UNICODE|X|VERBOSE)\\s?(,|$))+$', 'type': 'string'}, 'regexp': {'format': 'regex', 'type': 'string'}}, 'required': ['regexp'], 'type': 'object'}, 'type': 'array'}}, 'additionalProperties': False, 'properties': {'encoding': {'type': 'string'}, 'keys': {'additionalProperties': {'additionalProperties': False, 'properties': {'regexps': {'$ref': '#/$defs/regex_list'}, 'required': {'type': 'boolean'}}, 'required': ['regexps'], 'type': 'object'}, 'required': ['title', 'url'], 'type': 'object'}, 'sections': {'$ref': '#/$defs/regex_list'}, 'source': {'anyOf': [{'format': 'url', 'type': 'string'}, {'format': 'file', 'type': 'string'}]}}, 'required': ['source', 'keys'], 'type': 'object'}

flexget.plugins.input.regexp_parse.register_plugin()