Developer Guide to Transaction Screening (with Code Samples)

ISO 20022 is a game-changer as it sets a methodology to develop common financial messaging standards, based on a syntax-agnostic business dictionary. It is commonly implemented in XML but may be represented in any other relevant format including JSON for APIs.

The timelines for the global adoption of ISO 20022 have changed several times. Despite these shifts, it would be a big mistake to view the ISO 20022 migration as just another faraway IT project. The journey toward ISO 20022 is definitely moving ahead.

For those in the correspondent banking space, here is a recap on the global ISO 20022 roadmap underway.

ISO 20022 roadmap

The migration to ISO 20022 is planned to start in November 2022 for the correspondent banking space, with the introduction of a central Transaction Management Platform (TMP) that aims to reduce the cost and complexity of ISO 20022 adoption.

The TMP will provide interoperability between MT (FIN based), ISO 20022 messaging (FINPlus based) and API (ISO 20022-based JSON) channels by holding a central copy of the complete payment data accessible to every bank in the payment chain in their chosen format and via their chosen channel.

What benefit does the TMP bring to banks? They can adopt ISO 20022 at their own pace during the coexistence phase where ISO 20022 and MT messages will exist in parallel beyond November 2022 for a period of 3 years.

The decommissioning of MT messages is planned for November 2025. This holds true for Category 1 (Customer Payments and Cheques), Category 2 (Financial Institution Transfers) and Category 9 (Cash Management and Customer Status) message types.

Impact of ISO 20022 on Sanctions Screening

What remains unchanged is the need for every agent in the transaction to fulfill their compliance obligations.

Since the publication of the revised FATF Recommendation 16 in 2013 extending the scope of the former SR VII, the complete originator (payer) and beneficiary customer (payee) data has become critical for all involved parties in the payments chain of cross-border payments and domestic wire transfers. All party details are typically required by agents for sanctions screening.

Compared to the SWIFT FIN MT format, ISO 20022 provides significantly enhanced data quality in payment messages, including a much more granular structure for the individual address elements.

However, the coexistence phase will continue to pose substantial challenges, such as the risk of rich MX data being truncated when translated to MT format.

Aside from data translation issues, the legacy of unstructured data channeled through MT messages remains another persistent compliance challenge for agents.

Unstructured Data: a Major Challenge

The issue with unstructured data occurs when the data elements such as name, address, customer identification and country of residence are bundled in a string of characters. As the data within MT messages is usually mapped to several free format lines (e.g. 4 x 35 characters), it is difficult to identify which line holds which data element for consistent and unambiguous matching against sanction lists. For instance, there is no straightforward method to recognize a country name or a country code in a string of text.

The obvious answer to the above challenge is structured data. Unfortunately, statistics show that the use of MT103 messages with unstructured data remains predominant in the industry.

For example, as highlighted in a recent SWIFT whitepaper, SWIFT traffic between September 2017 and May 2018 shows a harsh reality:

“72-94% of party data fields in payments (field 50 and field 59) use free-format options with unstructured data to identify parties, with potentially vague or missing critical information needed to effectively screen and process payments. Financial institutions cite up to 10% of payments requiring manual intervention as a consequence.”

Truncation of data, handling of ultimate payer and payee information, separation of different address elements, mapping between different messaging standards: all these challenges pertaining to the format, quality and completeness of data are stumbling blocks to ensuring full compliance with sanctions requirements.

Let’s figure out practically how Screena API can help to address the challenges inherent to the use of MT free format fields – with a specific focus on field 50a – in light of industry practices, including the Market Practice Guidelines for use of field 50a issued by the Payments Market Practice Group (PMPG).

ℹ️ Important information – Please note that relevant regulations and any applicable legislation take precedence over the guidance notes contained in this article. This article represents Screena’s best effort to assist peer partners, clients and prospects in the interpretation and implementation of the relevant topic(s). Screena cannot be held responsible for any error in this article or any consequence thereof.

About Field 50a

In MT 103 messages, field 50a (where “a” means that multiple option letters are available) is used for ordering customer information. Field 50a has three options – A, F and K.

The formats for the three options are:

	Format	Subfields
Option A	[/34x] 4!a2!a2!c[3!c]	(Account) (BIC)
Option F	35x 4*(1!n/33x)	(Party Identifier) (Name & Address)
Option K	[/34x] 4*35x	(Account) (Name & Address)
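The option A notation can be read as a pattern: 4!a means exactly four letters, 2!a exactly two letters, 2!c exactly two alphanumeric characters, and [3!c] an optional three-character branch code. Here is a minimal sketch in Python, assuming upper-case input. The function name is ours and purely illustrative, and a format check says nothing about whether the BIC is actually registered.

```python
import re

# 4!a (institution) + 2!a (country) + 2!c (location) + optional 3!c (branch)
BIC_PATTERN = re.compile(r"[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?")


def looks_like_bic(value: str) -> bool:
    """Return True when value has the shape of an 8- or 11-character BIC."""
    return BIC_PATTERN.fullmatch(value) is not None


print(looks_like_bic("DEUTDEFF"))      # shape of an 8-character BIC
print(looks_like_bic("DEUTDEFF500"))   # shape of an 11-character BIC
print(looks_like_bic("FRITZ MUSTER"))  # a name, not a BIC
```

Such a check can help to route a party field to BIC screening or to name screening before building a search query.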

For simplicity, we will leave option A aside, as it is appropriate only when the ordering customer is a corporation that has been assigned a non-financial institution Business Identifier Code (non-financial institution BIC), or when a bank makes payments on its own behalf and the bank’s Business Identifier Code (financial institution BIC) is used as the identifier.

We kindly advise you to check out our previous article where we explain how to screen a BIC. You can also take a look at our short video tutorial presenting two common methods for screening financial institutions.

Field 50a with Option F

Field 50a with option F provides the possibility to clearly structure the ordering customer information.

That said, and as mentioned above, the industry has not yet seen a significant shift to option F, despite the promise that it would slowly but surely replace the free-format option K as the preferred option in field 50a. We will come back later to the shortcomings of option K and how to overcome them.

In option F, the following line formats must be used:

	Format	Data Elements
Line 1 (subfield Party Identifier)	/34x or 4!a/2!a/27x	(Account) or (Code)(Country Code)(Identifier)
Lines 2-5 (subfield Name and Address)	1!n/33x	(Number)(Details)

Each line of subfield 2 Name and Address – when present – must start with a number between 1 and 8 followed by a slash. Those numbers are used to identify ordering customers’ information, including name, address, country and town of residence. For further details, check out the official Field 50a Definition from SWIFT Knowledge Center.

Preferred Usage of Option F

In the context of sanctions screening, the PMPG specifies that the preferred usage of option F is:

	Description
Line 1	/Account number
Line 2	1/Name of the ordering customer
Line 3	2/Address details
Line 4	3/Country code/Town

Here is an example of field 50a with option F.

:50F:/DE25481230000001998736
1/FRITZ MUSTER 
2/ROSENWEG 6
3/DE/DRESDEN
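Because every line of option F carries a number, the field can be split into data elements mechanically. The following Python sketch shows one way to do it for the line numbers used in this article (1, 2 and 3). The function name and the keys of the returned dictionary are our own choices for illustration, not part of any SWIFT or Screena specification.

```python
def parse_50f(field: str) -> dict:
    """Split a :50F: field into account or identifier, name, street, country and town."""
    lines = [line.strip() for line in field.strip().splitlines() if line.strip()]
    first = lines[0]
    if first.startswith(":50F:"):
        first = first[len(":50F:"):]

    party = {}
    if first.startswith("/"):
        # Line 1 in the /34x form: an account number
        party["account"] = first[1:]
    else:
        # Line 1 in the 4!a/2!a/27x form: code, country code, identifier
        code, country, identifier = first.split("/", 2)
        party["identifier"] = {"code": code, "country": country, "value": identifier}

    names, streets = [], []
    for line in lines[1:]:
        number, _, details = line.partition("/")
        if number == "1":      # name, possibly continued on several lines
            names.append(details)
        elif number == "2":    # address details
            streets.append(details)
        elif number == "3":    # country code, then town
            parts = details.split("/")
            party["country"] = parts[0]
            if len(parts) > 1:
                party["town"] = parts[1]
        else:                  # other numbers (4 to 8) are kept untouched
            party.setdefault("other", {})[number] = details
    party["name"] = " ".join(names)
    party["street"] = " ".join(streets)
    return party


sample = """:50F:/DE25481230000001998736
1/FRITZ MUSTER
2/ROSENWEG 6
3/DE/DRESDEN"""
party = parse_50f(sample)
print(party)
```

The resulting elements map one-to-one to the name and address attributes of a search query.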

Let’s now turn to Screena API. This code example executes a search of the above ordering customer against the United Nations list based on the data elements provided in field 50a with the preferred usage of option F.

{
	"queries": [{
		"sourceData": [{
			"dataID": "DE25481230000001998736",
			"names": [{
				"fullName": "FRITZ MUSTER"
			}],
			"addresses": [{
				"street": "ROSENWEG 6",
				"city": "DRESDEN",
				"country": "DE"
			}]
		}],
		"targetData": [{
			"datasets": [{
				"label": "UN"
			}]
		}],
		"threshold": 0.85,
		"entityTypeAlgo": {
			"type": "exact_match",
			"exclude": [
				"vessel", "aircraft"
			]
		},
		"addressAlgo": {
			"type": "same_region",
			"nullMatch": true
		}
	}]
}
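In practice, such a query is usually assembled by code rather than written by hand. The Python sketch below builds the same request body from individual data elements. The helper name and its parameters are hypothetical; only the JSON attributes are taken from the example above. Sending the request is left out on purpose, as the endpoint and authentication details are not covered in this article.

```python
import json


def build_query(account, name, street, city, country, dataset="UN", threshold=0.85):
    """Assemble the request body shown above from option F data elements."""
    return {
        "queries": [{
            "sourceData": [{
                "dataID": account,
                "names": [{"fullName": name}],
                "addresses": [{"street": street, "city": city, "country": country}],
            }],
            "targetData": [{"datasets": [{"label": dataset}]}],
            "threshold": threshold,
            "entityTypeAlgo": {"type": "exact_match", "exclude": ["vessel", "aircraft"]},
            "addressAlgo": {"type": "same_region", "nullMatch": True},
        }]
    }


payload = build_query(
    account="DE25481230000001998736",
    name="FRITZ MUSTER",
    street="ROSENWEG 6",
    city="DRESDEN",
    country="DE",
)
print(json.dumps(payload, indent=2))
```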

If we take a closer look at the parameters of this search query, we see many interesting things.

Name Matching Threshold

First of all, the threshold value applied for name matching is 85%.

By default (i.e. when the attribute threshold is not specified in a search query), Screena sets the threshold value at 60%, which provides maximum recall as requested by the majority of users. If you are not yet familiar with such terms as recall and precision, this Wikipedia article is definitely worth a read.

You can use the threshold attribute to balance recall against precision, that is, how many true matches are caught versus how many false positives are returned.

Raising the threshold, as in the example above, would result in fewer matches, which would increase the possibility of false negatives, and vice-versa.
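To make this trade-off concrete, here is a small Python sketch that applies two threshold values to the same set of scored candidates. The names and scores are invented for illustration, and treating a score equal to the threshold as a hit is an assumption on our side.

```python
# Hypothetical (list name, similarity score) pairs for the party "FRITZ MUSTER"
candidates = [
    ("FRITZ MUSTER", 0.97),
    ("FRITZ MUSTERMANN", 0.88),
    ("FRIEDRICH MUSTER", 0.81),
    ("F. MEISTER", 0.64),
]


def hits_at(threshold, scored):
    """Keep the candidates whose score reaches the threshold."""
    return [name for name, score in scored if score >= threshold]


for threshold in (0.60, 0.85):
    print(threshold, hits_at(threshold, candidates))
```

At 60%, all four candidates are returned for review. At 85%, only two remain: fewer alerts to handle, but a true match scoring 81% would be missed.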

Therefore, finding the optimal threshold value for name matching mainly depends on the firm’s risk appetite. Its risk posture shall take many factors into consideration. Above all, the quality and the completeness of the data that has to be matched are critical. In this regard, extensive user acceptance tests (UAT) are an absolute must before determining a suitable value in live conditions.

Let’s add that Screena API allows the setting of different threshold values per jurisdiction, per transaction, and per payment element – including party, bank identification, and free format narrative fields. We will unveil in another article how to implement such a risk-based screening strategy with maximum granularity.

Entity Type Matching

In most cases, when screening MT 103 messages, there is no unique and unambiguous way to identify whether a party is an individual or an organization.

As seen earlier, there is one exception – when the field 50a is used with option A. In this case, the party is always an organization (i.e. either a corporation or a bank).

As a consequence, leaving option A aside, the risk of returning irrelevant hits is significantly higher. Using the option exclude within entityTypeAlgo helps to exclude matches for specific entity types (e.g. vessel or aircraft) that – by definition – cannot appear within party fields. This is a good way to reduce obviously false hits, such as those of vessels having names that are also common names of individuals (e.g. Christina or Mariana).

Country Matching

Lastly, when combined with the preferred usage of option F, field 50a provides structured address information, including a country code that can be used as a secondary attribute to retain or discard a hit resulting from name matching.

In this case, it is possible to specify addressAlgo with one of the following values: “same_country”, “same_subregion”, or “same_region”. Depending on the chosen value, the country provided with the party will be matched against the corresponding list element based on the United Nations geoscheme. To learn more about the United Nations geoscheme grouping: https://developer.screena.ai/#un-grouping.

You will also note the activation of the option nullMatch. This parameter is used to determine whether a match should be returned when the attribute associated with an algo is either empty or not provided within the list(s).

As a rule of thumb, knowing that the lists are not always enriched with all the attributes specified in a search query, we recommend always setting this parameter to true. As a precaution, even if you forget to specify this parameter in your search query, Screena will activate it by default.
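The logic behind these values and the nullMatch option can be sketched in a few lines of Python. This is our own simplified reading of the behaviour described above, not Screena’s implementation; the function name is hypothetical and the lookup table is a tiny excerpt of the United Nations geoscheme.

```python
# Excerpt of the UN geoscheme: country code -> (region, sub-region)
GEOSCHEME = {
    "DE": ("Europe", "Western Europe"),
    "BE": ("Europe", "Western Europe"),
    "IT": ("Europe", "Southern Europe"),
    "JP": ("Asia", "Eastern Asia"),
}


def country_match(party_country, list_country, algo_type="same_region", null_match=True):
    """Decide whether a name hit is retained based on the country attribute."""
    if not list_country:
        # The list record has no country: nullMatch decides
        return null_match
    if algo_type == "same_country":
        return party_country == list_country
    # Countries outside this excerpt would raise a KeyError in this sketch
    region, subregion = GEOSCHEME[party_country]
    list_region, list_subregion = GEOSCHEME[list_country]
    if algo_type == "same_subregion":
        return subregion == list_subregion
    if algo_type == "same_region":
        return region == list_region
    raise ValueError("unknown algo type: " + algo_type)


print(country_match("DE", "IT", "same_region"))     # both in Europe
print(country_match("DE", "IT", "same_subregion"))  # Western vs Southern Europe
print(country_match("DE", None))                    # no country on the list record
```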

⚠️ Watch out – it is not always advisable to use the country code of a party as a secondary matching criterion. By nature, the address of an individual or an organization can change frequently. Choosing whether country matching is suitable as a complement to name matching depends on how reliable the information is within transactions as well as within lists. Here again, the decision should be made in accordance with the firm’s risk appetite.

Other Possible Usage of Option F

If no address details are available, then the date and place of birth of the ordering customer or a unique customer identification number or a national identity card number of the ordering customer must be provided.

Here is an example of option F with name and date and place of birth.

:50F:/BE30001216371411
1/PHILIPS MARK
4/19720830
5/BE/BRUSSELS

This code example executes a search of the above ordering customer against the European Union list based on the data elements provided in field 50a with the other possible usage of option F.

{
	"queries": [{
		"sourceData": [{
			"dataID": "BE30001216371411",
			"names": [{
				"fullName": "PHILIPS MARK"
			}],
			"datesOfBirth": [{
				"date": "1972-08-30"
			}],
			"placesOfBirth": [{
				"city": "BRUSSELS",
				"country": "BE"
			}]
		}],
		"targetData": [{
			"datasets": [{
				"label": "EU"
			}]
		}],
		"threshold": 0.85,
		"entityTypeAlgo": {
			"type": "exact_match",
			"exclude": [
				"vessel", "aircraft"
			]
		},
		"dateOfBirthAlgo": {
			"type": "same_year",
			"nullMatch": true
		},
		"placeOfBirthAlgo": {
			"type": "same_region",
			"nullMatch": true
		}
	}]
}

In this case, addresses and addressAlgo have been replaced with placesOfBirth and placeOfBirthAlgo. In addition, the parameters datesOfBirth and dateOfBirthAlgo have been added to the search query so that matches are returned only for list records with either an empty date of birth (i.e., parameter nullMatch set to true) or the same year of birth.
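Two small steps are implied here: converting the date found in line 4 (YYYYMMDD) to the ISO format used in the query, and comparing years. Here is a Python sketch of both, under the assumption that same_year with nullMatch behaves as described above. The function names are ours.

```python
from datetime import date, datetime
from typing import Optional


def parse_swift_date(value: str) -> date:
    """Convert a YYYYMMDD date, as found in line 4 of option F, to a date object."""
    return datetime.strptime(value, "%Y%m%d").date()


def same_year(party_dob: date, list_dob: Optional[date], null_match: bool = True) -> bool:
    """Retain a hit when the years are equal, or when the list has no date of birth."""
    if list_dob is None:
        return null_match
    return party_dob.year == list_dob.year


dob = parse_swift_date("19720830")
print(dob.isoformat())                    # 1972-08-30
print(same_year(dob, date(1972, 1, 1)))   # same year, different day
print(same_year(dob, date(1973, 8, 30)))  # different year
print(same_year(dob, None))               # list record without date of birth
```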

Field 50a with Option K

Field 50a with option K provides ordering customer information in a less structured format.

Although posing serious challenges for sanctions compliance, option K is still largely the dominant format for MT 103 messages. This said, the PMPG clearly states that “correct use of option K is an acceptable means of complying with FATF recommendations”.

Let’s see how to get it done right.

Correct Usage of Option K

Line 1 must include the account number of the ordering customer, preceded by a slash. If no account number is available, then Line 1 must carry a unique identifier, preceded by a slash.

Line 2 must carry the name of the ordering customer.

Lines 3-5 are for address details of the ordering customer in line with the generic format specifications for a 4*35x field.

Here is an example of field 50a with the correct usage of option K.

:50K:/BE68539007547034
DUPONT MARTIN
AVENUE DU LAC 26
1000 BRUSSELS - BE
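When option K is used correctly, the position of each line is the only clue to its meaning. The Python sketch below relies on exactly that: line 1 is the account, line 2 the name, and the remaining lines are joined into a single address string. The function name and dictionary keys are illustrative, and the result is only as good as the sender’s discipline.

```python
def parse_50k(field: str) -> dict:
    """Split a correctly used :50K: field into account, name and full address."""
    lines = [line.strip() for line in field.strip().splitlines() if line.strip()]
    first = lines[0]
    if first.startswith(":50K:"):
        first = first[len(":50K:"):]

    if first.startswith("/"):
        account, body = first[1:], lines[1:]
    else:
        # No account line: everything is name and address
        account, body = None, [first] + lines[1:]

    return {
        "account": account,
        "name": body[0] if body else "",
        # Assumes the name fits on one line, which is not always the case
        "fullAddress": " ".join(body[1:]),
    }


sample = """:50K:/BE68539007547034
DUPONT MARTIN
AVENUE DU LAC 26
1000 BRUSSELS - BE"""
party = parse_50k(sample)
print(party)
```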

This code example executes a search of the above ordering customer against the European Union list based on the data elements provided in field 50a with the correct usage of option K.

{
	"queries": [{
		"sourceData": [{
			"dataID": "BE68539007547034",
			"names": [{
				"fullName": "DUPONT MARTIN"
			}],
			"addresses": [{
				"fullAddress": "AVENUE DU LAC 26 1000 BRUSSELS - BE"
			}]
		}],
		"targetData": [{
			"datasets": [{
				"label": "EU"
			}]
		}],
		"threshold": 0.75,
		"entityTypeAlgo": {
			"type": "exact_match",
			"exclude": [
				"vessel", "aircraft"
			]
		},
		"addressAlgo": {
			"type": "same_region",
			"nullMatch": true
		}
	}]
}

Unlike option F, which provides structured address information, option K carries address details in an unstructured format, so they have to be posted using a different field, fullAddress.

When fullAddress is used in combination with addressAlgo, Screena will parse the content of the field and extract any countries possibly detected within it. This setting helps to use unstructured address data for country matching on top of name matching.
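To get a feel for what country detection in free text involves, here is a deliberately naive Python sketch. It is not Screena’s algorithm: it looks up tokens in a tiny, hypothetical gazetteer, whereas a production service relies on a far richer geographical database and disambiguation logic.

```python
import re

# Tiny hypothetical gazetteer, for illustration only
COUNTRY_CODES = {"BE", "DE", "FR", "KY"}
PLACE_TO_COUNTRY = {
    "BRUSSELS": "BE",
    "DRESDEN": "DE",
    "PARIS": "FR",
    "CAYMAN ISLANDS": "KY",
}


def detect_countries(full_address: str) -> set:
    """Return the country codes suggested by codes or place names in the text."""
    text = full_address.upper()
    tokens = re.findall(r"[A-Z]+", text)
    found = {token for token in tokens if token in COUNTRY_CODES}
    for place, country in PLACE_TO_COUNTRY.items():
        if re.search(rf"\b{re.escape(place)}\b", text):
            found.add(country)
    return found


print(sorted(detect_countries("AVENUE DU LAC 26 1000 BRUSSELS - BE")))
print(sorted(detect_countries("12 RUE DE LA PAIX 75002 PARIS")))
```

The first address yields Belgium only. The second one yields France but also Germany, because the French word “DE” is mistaken for a country code. This is the kind of ambiguity that makes country matching on unstructured data more delicate than on structured data.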

⚠️ Watch out – As with option F, firms should decide whether to configure country matching based on option K in accordance with their risk appetite, all the more so as geographic information provided in an unstructured format is by nature prone to detection errors.

For information, Screena Geolocation feature is based on GeoNames, a geographical database that contains over eleven million placenames. Screena Geolocation is also used to screen free-format fields against Embargo data and identify a possible connection with sanctioned countries. We will bring country-based screening into focus in our next article.

Truncated Names

Put that way, option K seems to pose no particular problem. As always, the devil is in the details.

One obvious issue comes out when the name of the ordering customer does not fit onto a single line of 35 characters.

Here is one example for field 50a with option K where the name of the ordering customer exceeds 35 characters and has to be continued on a second line of the 4*35x field.

:50K:/BE68539007547034
SHOE SHINE IMPORT AND EXPORT
ZANG MI TRADING CORPORATION
AVENUE DU LAC 26
1000 BRUSSELS - BE

This length issue has been addressed in the ISO 20022 payments messages where the name has been extended to 140 characters.

When it comes to MT 103 messages, option F is the preferred approach as it allows for continuing the name of the ordering customer on several lines provided each line starts with number 1 followed by a slash. See the same example converted to option F.

:50F:/BE68539007547034
1/SHOE SHINE IMPORT AND EXPORT
1/ZANG MI TRADING CORPORATION
2/AVENUE DU LAC 26
3/BE/BRUSSELS/1000
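Producing such continuation lines can be automated. The Python sketch below wraps a name into lines of at most 33 characters of details, each prefixed with “1/”. The helper name and the default of two name lines are our assumptions; the subfield holds four lines in total, which the name shares with the address and country lines.

```python
import textwrap


def name_to_option_f_lines(name: str, max_lines: int = 2) -> list:
    """Wrap a name into '1/'-prefixed lines with at most 33 characters of details."""
    chunks = textwrap.wrap(name, width=33)
    if len(chunks) > max_lines:
        raise ValueError("name does not fit in the lines reserved for it")
    return ["1/" + chunk for chunk in chunks]


long_name = "SHOE SHINE IMPORT AND EXPORT ZANG MI TRADING CORPORATION"
name_lines = name_to_option_f_lines(long_name)
for line in name_lines:
    print(line)
```

Note that this greedy wrapping fills the first line as much as possible, so it breaks the name at a different word than the hand-written example above. Both variants carry the same name once the lines are joined.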

When option F is definitely not possible, Screena provides two workarounds to cater to long names.

The first option is to stick to the first line that contains the name of the ordering customer, and thus, disregard the second line for name screening. Although this workaround is risky as it increases the possibility of false negatives, Screena API embeds algorithms to detect truncated names. Furthermore, our machine learning models have been specifically trained to deal with such cases and make more accurate predictions than traditional fuzzy logic algorithms (e.g., Levenshtein, Jaro Winkler) in the event of truncated or missing name components.

Let’s give it a try with a real example. This code example posted to Screena One’s name matching endpoint computes how similar two names of different lengths are.

{
	"queries": [{
		"sourceData": [{
			"names": [{
				"fullName": "Mohammed Ben Salmane"
			}]
		}],
		"targetData": [{
			"names": [{
				"fullName": "Muhammad Bin Salman Bin Abd Al-Aziz Al Saud"
			}]
		}]
	}]
}

This query returns a fairly high score of 79%, compared with 37% for the Levenshtein algorithm and 66% for Jaro-Winkler.

{
    "responseHeader": {
        "requestID": "GEN2054093905-1646662965061",
        "responded": "2022-03-07T14:22:45.061Z"
    },
    "results": [
        {
            "matchType": "st_match",
            "score": 0.79,
            "sourceData": [
                {
                    "entityType": "unknown",
                    "names": [
                        {
                            "fullName": "Mohammed Ben Salmane",
                            "normalized": "mohamed bin salmane"
                        }
                    ],
                    "cultureCodes": [
                        "ar",
                        "bn",
                        "de",
                        "en",
                        "nl",
                        "zh"
                    ]
                }
            ],
            "targetData": [
                {
                    "entityType": "unknown",
                    "names": [
                        {
                            "fullName": "Muhammad Bin Salman Bin Abd Al-Aziz Al Saud",
                            "normalized": "mohamed bin salman bin abd al aziz al saud"
                        }
                    ],
                    "cultureCodes": [
                        "ar"
                    ]
                }
            ]
        }
    ]
}
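For comparison, here is a plain Python sketch of a normalized Levenshtein similarity. It is a textbook implementation, not the exact tool used for the figures quoted above, and the result depends on choices such as case handling and normalization. One thing holds in any case: with 20 characters on one side and 43 on the other, at least 23 edits are needed, so the similarity cannot exceed 20/43, or about 47%.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Scale the distance to a 0..1 similarity using the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(a, b) / longest


source = "Mohammed Ben Salmane"
target = "Muhammad Bin Salman Bin Abd Al-Aziz Al Saud"
print(round(levenshtein_similarity(source.lower(), target.lower()), 2))
```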

⚠️ Watch out – Although Screena was designed to address the most complex challenges in the field of name matching, please keep in mind that this approach is no silver bullet and requires extensive UAT to determine an optimal threshold value.

With option K, you ought to consider lowering the threshold value as a way to mitigate the risk of false negatives due to truncated names. Doing so will inevitably come at the cost of more false positives, which is the price to pay for leaving the door open to option K.

Narratives

When the first workaround is not a good fit, then narratives come to the rescue.

In case you missed it, we already introduced narratives in our first article. In a nutshell, narratives are free format fields that can contain information about named entities, including party elements specified within payment messages in an unstructured format.

Therefore, Screena field narrative is a suitable option to screen field 50a when correct usage of option K is not guaranteed. However, this approach is only advisable in a worst-case scenario. Here is a typical example.

:50K:/BE3000121637141
GRAND CAYMAN ENERGY SERVICES 
CORPORATION P.O. BOX 12345
340 NORTH SOUTH ROAD
GEORGE TOWN KY1-111, CAYMAN ISLANDS

In this case, as the ordering customer name is longer than 35 characters (i.e. “GRAND CAYMAN ENERGY SERVICES CORPORATION”), the excess information (i.e., “CORPORATION”) is populated in the second line, mingled with the remaining address information (i.e., “P.O. BOX 12345 340 NORTH SOUTH ROAD GEORGE TOWN KY1-111, CAYMAN ISLANDS”).

If you choose the first approach described above, the excess name information will be mapped to the first address line and become part of the address information. As a result, the name element does not reflect the full entity name and the address information is polluted, which may lead to detection mistakes, including false negatives in the worst case.
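The effect is easy to reproduce. In the Python sketch below, a positional mapping (first line as name, the rest as address) is applied to the lines of the example above and compared with the single string used for narrative screening. Both function names are ours and purely illustrative.

```python
lines = [
    "GRAND CAYMAN ENERGY SERVICES",
    "CORPORATION P.O. BOX 12345",
    "340 NORTH SOUTH ROAD",
    "GEORGE TOWN KY1-111, CAYMAN ISLANDS",
]


def naive_mapping(body_lines):
    """Positional mapping: first line is the name, the rest is the address."""
    return {"fullName": body_lines[0], "fullAddress": " ".join(body_lines[1:])}


def as_narrative(body_lines):
    """Last resort: keep everything in one free-format string."""
    return " ".join(body_lines)


mapped = naive_mapping(lines)
print(mapped["fullName"])     # name without "CORPORATION"
print(mapped["fullAddress"])  # address starting with "CORPORATION"
print(as_narrative(lines))
```

With the positional mapping, the name loses “CORPORATION” and the address gains it. The narrative string keeps all components together, at the price of losing the distinction between name and address.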

We highlighted before that Screena is built to tackle those issues to the extent that name elements are not obfuscated in a way that would even make them hard to decipher for humans.

When no other option is suitable, firms can choose to screen field 50a using the field narrative as a last resort.

This code example executes a search of the above ordering customer against the UK list based on the data elements provided in field 50a when the correct usage of option K is uncertain.

{
	"queries": [{
		"sourceData": [{
			"dataID": "BE3000121637141",
			"narrative": "GRAND CAYMAN ENERGY SERVICES CORPORATION P.O. BOX 12345 340 NORTH SOUTH ROAD GEORGE TOWN KY1-111, CAYMAN ISLANDS"
		}],
		"targetData": [{
			"datasets": [{
				"label": "UK"
			}]
		}],
		"threshold": 0.87,
		"entityTypeAlgo": {
			"type": "exact_match",
			"exclude": [
				"vessel", "aircraft"
			]
		},
		"addressAlgo": {
			"type": "same_region",
			"nullMatch": true
		}
	}]
}

This method is likely to return the highest false-positive rate as there is no direct and unambiguous way to differentiate party name from address elements. Said differently, each and every component within the field narrative will be screened against the list, regardless of which type of data element it represents. Typically, false positives will be returned when the address associated with the ordering customer within field 50a contains elements that match with sanctioned entities.

Firms might consider setting the threshold value above 85% as a way to circumvent this problem. As always, UAT is an absolute must before sticking to a threshold value suitable for production.

As seen in our first article, recall that the field narrative can also be used to screen other free-format fields in MT 103 payment messages, including field 70 (Remittance Information) and field 72 (Sender to Receiver Information).

About Field 59a

When it comes to beneficiary customer information, the same screening strategy can be applied, as field 59a also has three options – A, F and no letter.

The formats for the three options are:

	Format	Subfields
Option A	[/34x] 4!a2!a2!c[3!c]	(Account) (BIC)
Option F	[/34x] 4*(1!n/33x)	(Account) (Name & Address)
No letter option	[/34x] 4*35x	(Account) (Name & Address)

The no letter option is equivalent to the less structured option K of field 50a.
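Since fields 50a and 59a share the same logic, the screening approach can be chosen from the field tag alone. Here is a minimal Python sketch of such a dispatch. The strategy labels (“bic”, “structured”, “free_format”) are hypothetical names of our own, standing for the approaches discussed in this article.

```python
import re

# Field number (50 or 59) followed by an optional option letter
TAG_PATTERN = re.compile(r"^:(50|59)([AFK]?):")


def screening_strategy(field: str) -> str:
    """Pick a screening approach from the tag of a party field."""
    match = TAG_PATTERN.match(field)
    if match is None:
        raise ValueError("not a field 50a or 59a")
    number, option = match.groups()
    if option == "A":
        return "bic"          # screen the BIC
    if option == "F":
        return "structured"   # numbered lines: name, address, country
    if (number, option) in {("50", "K"), ("59", "")}:
        return "free_format"  # positional mapping or narrative
    raise ValueError("unsupported option for field " + number)


print(screening_strategy(":59:/BE68539007547034"))
print(screening_strategy(":59F:/BE30001216371411"))
print(screening_strategy(":50K:/BE68539007547034"))
```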

For further details, check out the official Field 59a Definition from SWIFT Knowledge Center.

Next Readings

Our first article presented a high-level overview of the data elements within transactions relevant for screening.

This article set out guidelines for the structuring of ordering and beneficiary customer information in fields 50a and 59a – including the screening of MT messages with free format options. Our next article discusses how to use country-based screening data for countries and territories that are subject to broad embargo restrictions.

Our remaining articles will provide some good practices for custom screening strategies.

To get started with transaction screening, get in touch with our sanctions compliance experts today. You can request an API key for free and use our cloud-based sandbox environment for 30 days.

Developer Guide to Transaction Screening (with Code Samples) – Part 2