kodexa.steps
Some example steps that can be used locally
Submodules
Package Contents
Classes
- NodeTagger: A node tagger allows you to provide a type and content regular expression and then tag content in all matching nodes.
- NodeTagCopy: The NodeTagCopy action allows you to select nodes specified by the selector and create copies of the existing_tag (if it exists) with the new_tag_name.
- TextParser: Parser to load a source file as a text document. The text from the document may be placed on the root ContentNode or on the root's child nodes (controlled by lines_as_child_nodes).
- RollupTransformer: The rollup step allows you to decide how you want to collapse content in a document by removing nodes while maintaining content and features as needed.
- class kodexa.steps.NodeTagger(selector, tag_to_apply, content_re='.*', use_all_content=True, node_only=False, node_tag_uuid=None)
A node tagger allows you to provide a type and content regular expression and then tag content in all matching nodes.
It supports multiple matching groups. It can also tag all of a node's content, or tag only the node itself (ignoring the matching groups).
- selector
The selector to use to find the node(s) to tag
- content_re
A regular expression used to match the content in the identified nodes
- use_all_content
If set, all content in the node is tagged, so the tag has no start/end position
- tag_to_apply
The tag to apply to the node(s)
- node_only
Tag only the node, not its content
- node_tag_uuid
The UUID to use on the tag
- process(document)
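The interaction between content_re, its matching groups, and use_all_content can be easier to see in plain Python. The following is a minimal sketch that uses only the standard `re` module; it is not the Kodexa implementation. The function name `find_tag_spans`, the tuple layout, and the choice to match from the start of the content are all assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

Span = Tuple[Optional[int], Optional[int], str]


def find_tag_spans(content: str, content_re: str = ".*",
                   use_all_content: bool = True) -> List[Span]:
    """Return the (start, end, value) spans a tagger might record for one node.

    Hypothetical helper: it mirrors the documented parameters, not Kodexa's code.
    """
    pattern = re.compile(content_re)
    match = pattern.match(content)  # assumption: match from the start of the content
    if match is None:
        return []
    if use_all_content:
        # The whole content is tagged, so there is no start/end position.
        return [(None, None, content)]
    # One span per matching group, or the whole match if no groups are defined.
    indexes = range(1, pattern.groups + 1) if pattern.groups else [0]
    return [
        (match.start(i), match.end(i), match.group(i))
        for i in indexes
        if match.group(i) is not None
    ]


spans = find_tag_spans("Invoice 1234 due 2021",
                       r"Invoice (\d+) due (\d+)",
                       use_all_content=False)
print(spans)
```

With two groups in the pattern, the sketch records two spans, one per group; with use_all_content left at its default, it records a single span with no positions.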
- class kodexa.steps.NodeTagCopy(selector, existing_tag_name, new_tag_name)
The NodeTagCopy action allows you to select nodes specified by the selector and create copies of the existing_tag (if it exists) with the new_tag_name. If a tag with the ‘existing_tag_name’ does not exist on a selected node, no action is taken for that node.
- selector
The selector to match the nodes
- existing_tag_name
The existing tag name that will be the source
- new_tag_name
The new tag name that will be the destination
- process(document)
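The copy rule described above (copy when the source tag exists, otherwise leave the node alone) can be sketched without the library. In this illustration nodes are plain dictionaries with a `tags` mapping; that layout and the function name `copy_tag` are assumptions, not Kodexa's data model.

```python
from typing import Dict, List


def copy_tag(nodes: List[Dict], existing_tag_name: str, new_tag_name: str) -> int:
    """Copy a tag under a new name on every node that already carries it.

    Hypothetical helper: returns how many nodes received the new tag.
    """
    copied = 0
    for node in nodes:
        tags = node.get("tags", {})
        if existing_tag_name not in tags:
            # No source tag on this node, so no action is taken.
            continue
        # Copy each tag instance so the new tag does not share state with the old one.
        tags[new_tag_name] = [dict(instance) for instance in tags[existing_tag_name]]
        copied += 1
    return copied


selected = [
    {"content": "ACME Ltd", "tags": {"company": [{"value": "ACME Ltd"}]}},
    {"content": "Total", "tags": {}},
]
print(copy_tag(selected, "company", "vendor"))
```

The source tag stays in place on the first node, and the second node is left unchanged because it has no tag named `company`.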
- class kodexa.steps.TextParser(encoding='utf-8', lines_as_child_nodes=False)
Parser to load a source file as a text document. The text from the document may be placed on the root ContentNode or on the root’s child nodes (controlled by lines_as_child_nodes).
- encoding
The encoding that should be used when attempting to decode data (default ‘utf-8’)
- lines_as_child_nodes
If True, the lines of the file will be set as children of the root ContentNode; otherwise, the entire file content is set on the root ContentNode. (default False)
- decode_text(data)
- process(document)
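The effect of encoding and lines_as_child_nodes can be illustrated with a small stand-in. This is a sketch only: `SketchNode`, `parse_text`, and the node type names `text` and `line` are hypothetical and are not taken from Kodexa.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in for a content node."""
    node_type: str
    content: Optional[str] = None
    children: List["SketchNode"] = field(default_factory=list)


def parse_text(data: bytes, encoding: str = "utf-8",
               lines_as_child_nodes: bool = False) -> SketchNode:
    """Decode raw bytes and place the text on the root or on its children."""
    text = data.decode(encoding)
    root = SketchNode(node_type="text")
    if lines_as_child_nodes:
        # Each line of the file becomes a child of the root node.
        for line in text.splitlines():
            root.children.append(SketchNode(node_type="line", content=line))
    else:
        # The entire file content is set on the root node.
        root.content = text
    return root


whole = parse_text(b"first line\nsecond line")
by_line = parse_text(b"first line\nsecond line", lines_as_child_nodes=True)
print(whole.content)
print([child.content for child in by_line.children])
```

Placing lines on child nodes is useful when later steps select or tag individual lines; keeping everything on the root is simpler when the text is processed as a whole.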
- class kodexa.steps.RollupTransformer(collapse_type_res=None, reindex: bool = True, selector: str = '.', separator_character: str = None, get_all_content: bool = False)
The rollup step allows you to decide how you want to collapse content in a document by removing nodes while maintaining content and features as needed
- process(document)
- is_node_in_list(node, node_ids)
- Parameters
node –
node_ids –
Returns:
- exception kodexa.steps.KodexaProcessingException(message, description, advice=None, documentation_url=None)
Bases: Exception
This is a specialized exception. If it is thrown while in the Kodexa Platform, the additional exception details are included so that they can be presented back to the user.
- description
A longer description of the problem
- advice
Any advice on how to handle the problem. This can include markdown to help present possible solutions
- message
A short message to express the problem
- documentation_url
A link to a URL where the user might find more information on the problem
- __str__()
Return str(self).
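To show how the four fields might be filled in by a custom step, here is a minimal sketch. `SketchProcessingException` is a hypothetical stand-in with the same constructor signature as documented above, so that the example runs without the library; `require_setting` and the message wording are also invented for illustration.

```python
class SketchProcessingException(Exception):
    """Hypothetical stand-in mirroring the documented constructor signature."""

    def __init__(self, message, description, advice=None, documentation_url=None):
        super().__init__(message)
        self.message = message
        self.description = description
        self.advice = advice
        self.documentation_url = documentation_url


def require_setting(settings, key):
    """Return a required setting, or raise with details a user can act on."""
    if key not in settings:
        raise SketchProcessingException(
            message=f"Missing setting '{key}'",
            description=f"The step needs a value for '{key}' but none was supplied.",
            # Advice may include markdown, such as inline code.
            advice=f"Add `{key}` to the step options and run it again.",
        )
    return settings[key]


try:
    require_setting({}, "encoding")
except SketchProcessingException as exc:
    print(exc.message)
    print(exc.advice)
```

Keeping the message short and moving the detail into description and advice matches the split between the fields described above.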