Parsing BPMN

The example application assumes that a BpmnProcessSpec will be generated for each process independently of starting a workflow, and that each spec will be serialized immediately and assigned an ID. We’ll discuss serialization in greater detail later; for now, note that the file serializer writes a JSON representation of the spec to a file and uses the filename as the ID.

Note

This is a design choice: it would also be possible to re-parse the specs each time a process is run.
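The "write JSON to a file and use the filename as the ID" idea can be sketched with the standard library alone. This is not the example application's serializer: the function names, the use of a UUID for the filename, and the plain dictionary standing in for a serialized spec are all assumptions made for illustration.

```python
import json
import uuid
from pathlib import Path


def save_spec(directory, spec_dict):
    """Write a JSON-serializable spec to `directory` and return its ID.

    Hypothetical helper: the ID is just the filename without its extension.
    """
    spec_id = uuid.uuid4().hex
    path = Path(directory) / f"{spec_id}.json"
    path.write_text(json.dumps(spec_dict, indent=2))
    return spec_id


def load_spec(directory, spec_id):
    """Read a previously saved spec back, using the ID to find the file."""
    path = Path(directory) / f"{spec_id}.json"
    return json.loads(path.read_text())
```

Because the spec is stored once and looked up by ID, starting a workflow later only needs the ID, not the original BPMN files.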

Default Parsers

Importing

Each of the BPMN modules (bpmn, spiff, or camunda) has a parser that is preconfigured with the specs in that module (if a particular TaskSpec is not implemented in the module, the bpmn TaskSpec is used).

bpmn: from SpiffWorkflow.bpmn.parser import BpmnParser
dmn: from SpiffWorkflow.dmn.parser import BpmnDmnParser
spiff: from SpiffWorkflow.spiff.parser import SpiffBpmnParser
camunda: from SpiffWorkflow.camunda.parser import CamundaParser

Note

The default parser cannot parse DMN files. The BpmnDmnParser extends the default parser to add that capability. Both the spiff and camunda parsers inherit from BpmnDmnParser.
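If an application needs to pick a parser at run time, the import paths listed above can be kept in a lookup table and resolved lazily. The helper below is a hypothetical convenience, not part of SpiffWorkflow; only the module paths and class names are taken from the list above.

```python
import importlib

# Module path and class name for each parser, as listed above.
PARSER_LOCATIONS = {
    "bpmn": ("SpiffWorkflow.bpmn.parser", "BpmnParser"),
    "dmn": ("SpiffWorkflow.dmn.parser", "BpmnDmnParser"),
    "spiff": ("SpiffWorkflow.spiff.parser", "SpiffBpmnParser"),
    "camunda": ("SpiffWorkflow.camunda.parser", "CamundaParser"),
}


def load_parser_class(kind):
    """Import and return the parser class registered under `kind`.

    Hypothetical helper: the import happens only when a parser is requested.
    """
    try:
        module_name, class_name = PARSER_LOCATIONS[kind]
    except KeyError:
        raise ValueError(f"unknown parser kind: {kind!r}") from None
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

With SpiffWorkflow installed, `load_parser_class("spiff")()` would then create a parser with no arguments.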

Instantiation of a parser has no required arguments, but there are several optional parameters.

Validation

The SpiffWorkflow.bpmn.parser module also contains a BpmnValidator.

The default validator validates against the BPMN 2.0 spec. It is possible to import additional specifications (e.g. for custom extensions) as well.

By default the parser does not validate, but if a validator is passed in, it will be used on any files added to the parser.

from SpiffWorkflow.bpmn.parser import BpmnParser, BpmnValidator
parser = BpmnParser(validator=BpmnValidator())

Spec Descriptions

A default set of description attributes is provided for each Task Spec. The description is intended to be a user-friendly representation of the task type. The set is a mapping of XML tag to string.

The default set of descriptions can be found in SpiffWorkflow.bpmn.parser.spec_descriptions.
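To make "a mapping of XML tag to string" concrete, here is a small standalone sketch using only the standard library. The three entries and the helper names are illustrative assumptions, not the library's actual defaults; the keys are written in the `{namespace}tag` form that ElementTree uses for namespaced elements.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"


def full_tag(local_name):
    """Return the namespaced tag in ElementTree's `{namespace}name` form."""
    return f"{{{BPMN_NS}}}{local_name}"


# Illustrative entries only; the real defaults live in spec_descriptions.
DESCRIPTIONS = {
    full_tag("startEvent"): "Start Event",
    full_tag("userTask"): "User Task",
    full_tag("endEvent"): "End Event",
}


def describe_elements(bpmn_xml, descriptions):
    """List (element id, description) for every element with a known tag."""
    root = ET.fromstring(bpmn_xml)
    return [
        (element.get("id"), descriptions[element.tag])
        for element in root.iter()
        if element.tag in descriptions
    ]


SAMPLE = f"""<definitions xmlns="{BPMN_NS}">
  <process id="demo">
    <startEvent id="start"/>
    <userTask id="review"/>
    <endEvent id="end"/>
  </process>
</definitions>"""
```

Overriding a label is then a matter of copying the mapping and replacing one value, for example `{**DESCRIPTIONS, full_tag("userTask"): "Form"}`.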

Creating a BpmnProcessSpec from BPMN Process

From the add_spec method of our BPMN engine (engine/engine.py):

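def add_spec(self, process_id, bpmn_files, dmn_files):
    # Register the files with the parser before requesting any spec
    self.add_files(bpmn_files, dmn_files)
    try:
        # Parse the top-level process and everything it depends on
        spec = self.parser.get_spec(process_id)
        dependencies = self.parser.get_subprocess_specs(process_id)
    except ValidationException as exc:
        # Clear the parser's known processes so the files can be added again
        self.parser.process_parsers = {}
        raise exc
    # Serialize the spec with its dependencies; the serializer assigns the ID
    spec_id = self.serializer.create_workflow_spec(spec, dependencies)
    logger.info(f'Added {process_id} with id {spec_id}')
    return spec_id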

def add_files(self, bpmn_files, dmn_files):
    self.parser.add_bpmn_files(bpmn_files)
    if dmn_files is not None:
        self.parser.add_dmn_files(dmn_files)

The first step is adding BPMN and DMN files to the parser using the add_bpmn_files and add_dmn_files methods.

We use the get_spec method to parse the BPMN process with the provided process_id (the ID of the process, not its name).

Note

The parser was designed to load one set of files and parse a process, and it will raise a ValidationException if any duplicate IDs are present. The available processes are immediately added to process_parsers, so re-adding a file will generate an exception. Therefore, if we run into a problem (the specific case here) or wish to reuse the same parser, we need to clear this attribute.
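The behavior described in this note can be reproduced with a small stand-in class. Everything here is hypothetical except the attribute name process_parsers: TinyParser only records process IDs, and DuplicateIdError stands in for the ValidationException raised by the real parser.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"


class DuplicateIdError(Exception):
    """Stand-in for the ValidationException raised by the real parser."""


class TinyParser:
    """Hypothetical stand-in that only tracks process IDs."""

    def __init__(self):
        self.process_parsers = {}

    def add_bpmn_str(self, xml_text):
        root = ET.fromstring(xml_text)
        for process in root.iter(f"{{{BPMN_NS}}}process"):
            process_id = process.get("id")
            # Processes are registered as soon as a file is added,
            # so adding the same file twice triggers the error.
            if process_id in self.process_parsers:
                raise DuplicateIdError(f"duplicate process ID: {process_id}")
            self.process_parsers[process_id] = process


def add_with_reset(parser, xml_text):
    """Add XML to the parser, clearing its known processes on failure."""
    try:
        parser.add_bpmn_str(xml_text)
    except DuplicateIdError:
        parser.process_parsers = {}
        raise


ORDER_XML = (
    f'<definitions xmlns="{BPMN_NS}">'
    '<process id="order_product"/>'
    "</definitions>"
)
```

After the failed second call the registry is empty again, so the same parser object can accept the file on a later attempt.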

Other Methods for Adding Files

add_bpmn_files_by_glob: Loads files from a glob instead of a list.
add_bpmn_file: Adds one file rather than a list.
load_bpmn_str: Loads and parses XML from a string.
load_bpmn_io: Loads and parses XML from an object implementing the IO interface.
load_bpmn_xml: Parses BPMN from an lxml parsed tree.

Handling Subprocesses and Call Activities

Internally, Call Activities and Subprocesses (as well as Transactional Subprocesses) are all treated as separate specifications. This is to prevent a single specification from becoming too large, especially in the case where the same process spec will be called more than once.

The get_subprocess_specs method takes a process ID and recursively searches for Call Activities, Subprocesses, etc. used by or defined in the provided BPMN files. It returns a mapping of process ID to parsed specification.
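The recursive search can be illustrated on raw XML with the standard library. This sketch is not how the parser is implemented and covers only Call Activities (via the BPMN calledElement attribute); the function names are assumptions, and it returns a set of process IDs rather than parsed specs.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
PROCESS = f"{{{BPMN_NS}}}process"
CALL_ACTIVITY = f"{{{BPMN_NS}}}callActivity"


def index_processes(xml_documents):
    """Map process ID -> process element across all provided documents."""
    processes = {}
    for text in xml_documents:
        for process in ET.fromstring(text).iter(PROCESS):
            processes[process.get("id")] = process
    return processes


def collect_called_processes(process_id, processes, found=None):
    """Recursively collect the IDs of processes reached via call activities."""
    found = set() if found is None else found
    for call in processes[process_id].iter(CALL_ACTIVITY):
        called = call.get("calledElement")
        if called not in found:
            found.add(called)
            # Only descend into processes whose definition was provided.
            if called in processes:
                collect_called_processes(called, processes, found)
    return found


MAIN_XML = f"""<definitions xmlns="{BPMN_NS}">
  <process id="main"><callActivity id="c1" calledElement="child"/></process>
</definitions>"""
CHILD_XML = f"""<definitions xmlns="{BPMN_NS}">
  <process id="child"><callActivity id="c2" calledElement="grandchild"/></process>
  <process id="grandchild"/>
  <process id="unrelated"/>
</definitions>"""
```

The `found` set doubles as a visited list, so two processes that call each other do not cause infinite recursion.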

Other Methods for Finding Dependencies

find_all_specs: Returns a mapping of name -> BpmnWorkflowSpec for all processes in all files that have been provided to the parser at that point.
get_process_dependencies: Returns a list of process IDs referenced by the provided process ID.
get_dmn_dependencies: Returns a list of DMN IDs referenced by the provided process ID.

Creating a BpmnProcessSpec from a BPMN Collaboration

The parser can also generate a workflow spec based on a collaboration:

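def add_collaboration(self, collaboration_id, bpmn_files, dmn_files=None):
    self.add_files(bpmn_files, dmn_files)
    try:
        # Returns the generated top-level spec and the specs it depends on
        spec, dependencies = self.parser.get_collaboration(collaboration_id)
    except ValidationException as exc:
        # Clear the parser's known processes so the files can be added again
        self.parser.process_parsers = {}
        raise exc
    # Assumption: the remainder mirrors add_spec (serialize, then return the ID)
    spec_id = self.serializer.create_workflow_spec(spec, dependencies)
    return spec_id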

A spec is created for each of the processes in the collaboration, and each of these processes is wrapped inside a subworkflow. This means that a spec created this way will always require subprocess specs. The method therefore returns the generated spec (which doesn’t directly correspond to anything in the BPMN file) as well as the processes present in the file and their dependencies.
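To see which processes would end up wrapped as subworkflows, it can help to list the participants of a collaboration directly from the XML. The following standalone sketch uses only the standard library and the BPMN participant/processRef attributes; the function name and sample IDs are made up for illustration, and it does not build any specs.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"


def tag(local_name):
    """Return the namespaced tag in ElementTree's `{namespace}name` form."""
    return f"{{{BPMN_NS}}}{local_name}"


def processes_in_collaboration(bpmn_xml, collaboration_id):
    """Return the process IDs referenced by a collaboration's participants."""
    root = ET.fromstring(bpmn_xml)
    for collaboration in root.iter(tag("collaboration")):
        if collaboration.get("id") == collaboration_id:
            return [
                participant.get("processRef")
                for participant in collaboration.iter(tag("participant"))
                if participant.get("processRef")
            ]
    raise KeyError(collaboration_id)


SAMPLE = f"""<definitions xmlns="{BPMN_NS}">
  <collaboration id="sales">
    <participant id="p1" processRef="buyer_process"/>
    <participant id="p2" processRef="seller_process"/>
  </collaboration>
  <process id="buyer_process"/>
  <process id="seller_process"/>
</definitions>"""
```

Each ID returned here corresponds to one process that the generated top-level spec would run as a subworkflow.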