Introduction

This section explains how to delete documents from Shopware with the output-module. The following entities/documents are deletable:

Document         Shopware Table           Delete Modes
Category         category                 soft/hard
Customer         customer                 soft/hard
Product          product                  soft/hard
PropertyOption   property_group           hard
PropertyValue    property_group_option    hard

Document types that are not listed here are not necessarily undeletable. Please contact us if you need to delete other document types.

Delete Modes

There are two delete modes available: hard and soft. A "hard delete" removes the entity from Shopware, while a "soft delete" sets the target entity to inactive. Since not every entity provides an active field in Shopware, the soft delete is not supported for every document type.

Delete Mode Configuration

The delete-mode must be configured individually for each subsection. Example:

{
    "...": "...",
    "subsections": {
        "subsectionName": {
            "deleteMode": "{DELETE_MODE}"
        }
    }
}  

The module does not delete any entities unless deletion is explicitly authorized via configuration. If you do not want to delete documents for a subsection, you can simply omit the deleteMode entry (or enter a value that is not equal to "hard" or "soft").
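
The opt-in rule above can be sketched in Python. This is only an illustration of the described behaviour, not the module's actual code; the function name resolve_delete_mode and the subsection names in the usage below are hypothetical.

```python
from typing import Optional

# The only values that authorize a deletion, as described above.
VALID_DELETE_MODES = ("hard", "soft")


def resolve_delete_mode(config: dict, subsection_name: str) -> Optional[str]:
    """Return "hard" or "soft" if deletion is authorized, otherwise None.

    Hypothetical helper: a missing deleteMode entry or any other value
    means that nothing is deleted for this subsection.
    """
    subsection = config.get("subsections", {}).get(subsection_name, {})
    mode = subsection.get("deleteMode")
    return mode if mode in VALID_DELETE_MODES else None


config = {
    "subsections": {
        "product": {"deleteMode": "soft"},
        "category": {"deleteMode": "none"},  # not "hard"/"soft": no deletion
        "customer": {},                      # entry omitted: no deletion
    }
}
print(resolve_delete_mode(config, "product"))   # soft
print(resolve_delete_mode(config, "category"))  # None
```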

Delete Timestamps

As soon as a document contains a value for the deletedAt timestamp, the module handles it as deleted. This holds even if the document contains a value for updatedAt or createdAt that is newer than the value for deletedAt.
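
A minimal Python sketch of this rule follows. It assumes that any non-empty deletedAt value counts as "a value"; how the module validates the timestamp is not described here, and the helper name is_deleted is hypothetical.

```python
def is_deleted(document: dict) -> bool:
    """Treat a document as deleted as soon as deletedAt carries a value.

    Assumption: any non-empty value counts. updatedAt and createdAt are
    deliberately ignored, even if they are newer than deletedAt.
    """
    return bool(document.get("deletedAt"))


doc = {
    "createdAt": "2023-01-01T00:00:00Z",
    "deletedAt": "2023-02-01T00:00:00Z",
    "updatedAt": "2023-03-01T00:00:00Z",  # newer than deletedAt, still deleted
}
print(is_deleted(doc))  # True
```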

Delta Deletion

Removing a document from your data set does not lead to its deletion in Shopware. There is no delta-check in place whatsoever. In order to delete a document from Shopware, you have to set a valid value for the field deletedAt.

Heuristic Anomaly Detection

This feature prevents the unexpected deletion of large amounts of data at once. Its main purpose is to intercept errors of input modules that mistakenly provide all documents with a deletedAt timestamp, e.g. caused by incorrect delete commands from a data source that the input module does not detect.

Overview

So how does the anomaly detection work? From the user's point of view, only the following summary about this mode is really relevant.

  • The module analyzes each delete-progress and determines the number of documents that will actually be deleted from Shopware (documents already deleted in previous executions are taken into account in the anomaly calculations).
  • If an anomaly is detected (i.e. more than n % of the entities of the shop would be deleted), the module stops and persists a confirmation string.
  • This confirmation string must be added to the configuration at lockouts.deleteAnomalyConfirmation.
  • During the next execution, the anomaly detection is suppressed for the affected subsection.
  • Once the input is correct again, the persisted string is deleted, so the anomaly detection is active again during the next flow execution.
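
The confirmation step in this list can be pictured with the following Python sketch. It assumes that the configured value is compared to the persisted string by simple equality; where and how the module persists the string is not described here, so the function name and the persisted_confirmation argument are hypothetical.

```python
from typing import Optional


def anomaly_detection_suppressed(config: dict,
                                 persisted_confirmation: Optional[str]) -> bool:
    """Return True if the user has confirmed a previously detected anomaly.

    Assumptions: persisted_confirmation is the string stored by the module
    after an anomaly (None if nothing is stored), and a confirmation is
    valid when the configured value equals it exactly.
    """
    if persisted_confirmation is None:
        return False  # no open anomaly, detection stays active
    configured = config.get("lockouts", {}).get("deleteAnomalyConfirmation")
    return configured == persisted_confirmation


config = {"lockouts": {"deleteAnomalyConfirmation": "abc123"}}
print(anomaly_detection_suppressed(config, "abc123"))  # True
print(anomaly_detection_suppressed(config, "other"))   # False
```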

If you are interested in the details, you can read the section "Details - How Delete Anomalies are Detected" below.

Configuration

{
    "lockouts": {
        "deleteAnomalyConfirmation": "..."
    }
}

Threshold

The default threshold used by the module is 75%. The threshold defines the percentage of entities from the target shop that have to be deleted from Shopware in order to stop the execution.

Example: If >=75% of all products would be deleted, you have to confirm this via configuration.

You can set the threshold globally via configuration:

{
    "deleteAnomalyThresholdPercentage": 75
}

You can also override this value specifically for each subsection:

{
    "...": "...",
    "subsections": {
        "subsectionName": {
            "deleteAnomalyThresholdPercentage": 35
        }
    }
}  
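
The precedence of the two settings can be sketched in Python as follows. The sketch assumes that a subsection value wins over the global value and that the default of 75 applies when neither is set; the helper name resolve_threshold is hypothetical.

```python
DEFAULT_THRESHOLD_PERCENTAGE = 75  # default stated in this section
THRESHOLD_KEY = "deleteAnomalyThresholdPercentage"


def resolve_threshold(config: dict, subsection_name: str) -> float:
    """Pick the threshold for a subsection (hypothetical helper).

    Order of precedence: subsection override, global value, default.
    """
    subsection = config.get("subsections", {}).get(subsection_name, {})
    if THRESHOLD_KEY in subsection:
        return subsection[THRESHOLD_KEY]
    return config.get(THRESHOLD_KEY, DEFAULT_THRESHOLD_PERCENTAGE)


config = {
    "deleteAnomalyThresholdPercentage": 60,
    "subsections": {
        "subsectionName": {"deleteAnomalyThresholdPercentage": 35},
        "otherSubsection": {},
    },
}
print(resolve_threshold(config, "subsectionName"))   # 35
print(resolve_threshold(config, "otherSubsection"))  # 60
print(resolve_threshold({}, "subsectionName"))       # 75
```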

Details - How Delete Anomalies are Detected

Mapping subsections that support the deletion of entities contain the subsections delta-count and anomaly-check. Those subsections as well as the progress initialization are part of the delete anomaly detection process.

The following sections explain all steps of the anomaly detection.

Step 1 - Analyze Progress

The subsection init-progress does two things in order to prepare the anomaly detection:

  • initializes the delete-progresses with the number of documents with a valid deletedAt timestamp
  • compares each delete-progress with the number of entities available in Shopware

By comparing the delete-progress with the entity count in Shopware the module is able to detect "potential delete anomalies".
A potential anomaly is present when the number of documents to delete is higher than the configured threshold. This is what we call a "progress anomaly". Note that we do not yet know how many documents are actually going to be deleted, so we do not know whether the "progress anomaly" is actually a "delete anomaly". We have to distinguish between progress anomalies and delete anomalies because of a very common scenario for full imports:

  • Module runs full-imports every N days
  • The transfer database contains 900 products with a deletedAt timestamp, Shopware contains 1000 products
  • The init-progress subsection finds a potential delete anomaly: 90% of all products are potentially going to be deleted
  • But: 899 of the 900 products are already deleted by previous runs, the actual amount to delete is 1
  • That is why the heuristic detection cannot take place right after the progress initialization: the detection needs access to the cached entities in order to calculate the actual number of documents for which a Shopware entity is (still) present
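
The numbers of this scenario can be reproduced with a few lines of Python. The helper deletion_percentage is hypothetical, and returning 0 for an empty shop is an assumption of the sketch.

```python
def deletion_percentage(to_delete: int, total_in_shop: int) -> float:
    """Share of the shop's entities that would be deleted, in percent."""
    if total_in_shop == 0:
        return 0.0  # assumption: an empty shop cannot produce an anomaly
    return 100.0 * to_delete / total_in_shop


flagged_documents = 900   # documents with a deletedAt timestamp
products_in_shop = 1000   # products currently present in Shopware
already_deleted = 899     # removed by previous runs

# What init-progress sees: a potential ("progress") anomaly.
progress_pct = deletion_percentage(flagged_documents, products_in_shop)

# What is actually left to delete in this execution.
actual_pct = deletion_percentage(flagged_documents - already_deleted,
                                 products_in_shop)

print(progress_pct)  # 90.0
print(actual_pct)    # 0.1
```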

To avoid requiring the user to confirm potential anomalies on every full import, there are two additional subsections in place: delta-count and anomaly-check. They are responsible for calculating the number of entities that will actually be deleted in the current flow execution.

Step 2 - Delta Count

If no "progress anomaly" was detected during the progress initialization, this subsection finishes immediately. If the progress is suspicious, the affected documents are analyzed further in the subsection delta-count. This subsection checks for each document whether its corresponding Shopware entity is still available. This allows the next subsection, anomaly-check, to calculate the actual number of entities that are going to be deleted in the current flow execution.
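
The counting idea can be sketched in Python as follows. The document field "id" and the entity_exists callback are hypothetical stand-ins for the lookup against the cached Shopware entities, and the sketch counts sequentially rather than in parallel.

```python
from typing import Callable, Iterable


def count_pending_deletions(documents: Iterable[dict],
                            entity_exists: Callable[[str], bool]) -> int:
    """Count deleted documents whose Shopware entity is still present.

    Assumptions: each document has a hypothetical "id" field, and
    entity_exists(id) tells whether the entity is still in Shopware.
    """
    return sum(
        1
        for doc in documents
        if doc.get("deletedAt") and entity_exists(doc["id"])
    )


# Usage: three documents are flagged, but only "p2" still exists in the shop.
shop_ids = {"p2", "p4"}
documents = [
    {"id": "p1", "deletedAt": "2023-01-01T00:00:00Z"},
    {"id": "p2", "deletedAt": "2023-01-01T00:00:00Z"},
    {"id": "p3", "deletedAt": "2023-01-01T00:00:00Z"},
    {"id": "p4"},  # not flagged for deletion
]
pending = count_pending_deletions(documents, lambda i: i in shop_ids)
print(pending)  # 1
```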

Step 3 - Anomaly Check

The only reason why delta-count and anomaly-check are two separate subsections is that this allows the delta counts to be generated in parallel. After the delta-map has been generated, the actual anomaly check takes place. If the number of entities to be deleted is higher than the configured threshold, a lockout string is generated and the delete-subsection will not run.
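
A minimal Python sketch of the final check follows. It uses a strict "higher than" comparison as worded in this step (the example in the Threshold section writes ">="), and it generates the lockout string with uuid4 purely as a placeholder, since the real format of the string is not described here. The function name check_delete_anomaly is hypothetical.

```python
import uuid
from typing import Optional


def check_delete_anomaly(pending_deletions: int,
                         total_in_shop: int,
                         threshold_percentage: float) -> Optional[str]:
    """Return a lockout string if the deletion share exceeds the threshold.

    Returns None when no anomaly is present, i.e. the delete-subsection
    may run. The uuid4-based string is an assumption of this sketch.
    """
    if total_in_shop == 0:
        return None  # assumption: nothing in the shop, nothing to protect
    percentage = 100.0 * pending_deletions / total_in_shop
    if percentage > threshold_percentage:
        return uuid.uuid4().hex  # placeholder for the real lockout string
    return None


print(check_delete_anomaly(1, 1000, 75))                  # None
print(check_delete_anomaly(900, 1000, 75) is not None)    # True
```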