Discovering Screena: Frequently Asked Questions

When people ask us "What is Screena?", we usually answer that it is a next-generation, AI-driven SaaS screening API that accurately matches, searches and resolves named entities across datasets – at speed and at scale. For instance, financial services firms use Screena to screen their counterparties against large regulatory watchlists to meet their Know-Your-Customer (KYC) and Anti-Money Laundering (AML) obligations. Likewise, payment service providers integrate Screena into their payment platforms to ensure sanctions compliance when processing cross-border transactions. But this doesn't say much about how Screena works. So we have gathered a list of frequently asked questions to help you learn more about Screena's capabilities and find out how Screena differs from other watchlist screening solutions.

Data Processing

Speed and Real-Time Capabilities

Pricing

Solution Deployment

Security and Audit

Data Processing

Does Screena provide rules-based and/or fuzzy matching capabilities?

Screena's name-matching approach is twofold: it combines deterministic rules with predictive scoring methods to minimize both false negatives (i.e. achieve high recall) and false positives (i.e. achieve high precision).

Rules-based and fuzzy matching form the core approach for ensuring high recall, before our machine learning models reduce false positives.

We first employ traditional edit distance algorithms such as Jaro to measure string similarity when initiating searches on lists. We also apply rules-based algorithms and use proprietary name libraries to detect specific name patterns that cannot be addressed through string distance alone. These include, but are not limited to:

  • Name order variations and missing name components
  • Misspellings and errors (inverted letters, missing letters, substituted letters)
  • Truncated names
  • Name concatenations
  • Acronyms and initials
  • Nicknames, synonyms and common aliases
  • Titles, honorifics and company legal forms
  • Phonetic resemblances
  • Detection of stopwords and weighting of common words
  • Detection of locations (cities, towns, regions, ports)
  • Number variations
  • Domain names
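
To illustrate the first step, here is a minimal sketch of the textbook Jaro similarity, combined with a simple token-sorting rule that absorbs name order variations. It is not Screena's implementation: the function names `jaro` and `token_sort` are hypothetical, and the production engine relies on proprietary rules and name libraries that are not shown here.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only count as matching if they are close enough in position.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def token_sort(name: str) -> str:
    """Hypothetical rule: lowercase and sort tokens to absorb name order variations."""
    return " ".join(sorted(name.lower().split()))


print(round(jaro("MARTHA", "MARHTA"), 3))                        # 0.944 (inverted letters)
print(jaro(token_sort("Smith John"), token_sort("John Smith")))  # 1.0 (name order)
```

String distance alone handles inverted or missing letters reasonably well, but patterns such as nicknames, acronyms or legal forms need dedicated rules and libraries, which is why both approaches are combined.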

Does Screena have any machine learning or AI capabilities? If yes, at which part of the process are they used?

Machine learning prediction scoring is at the core of our matching engine to reduce false positives when simple deterministic rules are not accurate enough to immediately retain or reject candidate hits.

We first identify the typology and cultural context of both customer and list names. Detecting the typology and the culture of a name is instrumental in directing candidate hits to the most suitable machine learning models.

We then apply distinct supervised machine learning models, specifically trained on hundreds of thousands of names, for individuals and for organizations. For individuals, our models are segmented across distinct cultural groups. We train each model with specific name datasets. Our machine learning models encompass more than 150 different name matching features to increase the accuracy of name similarity scoring and dramatically reduce the number of false positives.

We are also developing new AI techniques to automatically detect geographical elements contained within unstructured fields in the context of transaction screening. Such techniques can help to accurately parse named entities and addresses, ensuring adequate matching of semantically similar elements against entity and embargo lists.

Which lists are available out of the box?

The table below lists all the datasets available on Screena’s sandbox and populated with watchlist entities continuously published by government authorities or international organizations. You can try out Instant Search and manually search for an individual, an organization, or a vessel on any of these lists.

Dataset Label | Description | Jurisdiction • Authority
UN | UN Security Council Consolidated Sanctions | Global • United Nations Security Council (UN SC)
WORLD BANK | World Bank Debarred Providers | Global • World Bank
EU | EU Financial Sanctions Files (FSF) | European Union • European External Action Service
USA | US Trade Consolidated Screening List (CSL) | United States • Department of Commerce - International Trade Administration
OFAC SDN | US OFAC Specially Designated Nationals (SDN) List | United States • Office of Foreign Assets Control (OFAC)
OFAC CONSOLIDATED | US OFAC Consolidated (non-SDN) List | United States • Office of Foreign Assets Control (OFAC)
UK | UK OFSI Consolidated List of Asset Freeze Targets | United Kingdom • Office of Financial Sanctions Implementation
UK INVESTMENT BAN | UK OFSI Russia: List of persons named in relation to financial and investment restrictions | United Kingdom • Office of Financial Sanctions Implementation
AUSTRALIA | Australian Sanctions Consolidated List | Australia • Department of Foreign Affairs and Trade (DFAT)
FRANCE | French Freezing of Assets | France • Ministry of Economy, Finance, and Recovery
SWITZERLAND | Swiss SECO Sanctions/Embargoes | Switzerland • State Secretariat for Economic Affairs (SECO)
CANADA | Canadian Special Economic Measures Act Sanctions | Canada • Global Affairs Canada
WORLD LEADERS | CIA World Leaders | United States • Central Intelligence Agency (CIA)
INTERPOL | INTERPOL Red Notices | Global • INTERPOL
FBI | FBI Most Wanted | United States • FBI

How is watchlist data loaded into Screena?

Watchlist data can be uploaded either automatically from public or commercial sources, or on demand (private lists).

Public watchlists (OSINT) are crawled in their original formats (XML, CSV, JSON). Likewise, commercial off-the-shelf lists (COTS) are uploaded in their native formats – including full stock and deltas.

Private lists can be uploaded through the API in JSON or in CSV format. All private lists containing personally identifiable information are encrypted in transit and at rest.

Is Screena watchlist agnostic?

Screena is 100% watchlist agnostic. We can source OSINT, COTS and private lists. There is no limit whatsoever with regard to the number and size of lists that can be uploaded.

All lists are automatically mapped to Screena’s named entity data model with no development effort for our clients. Whenever technically possible, we ensure data normalization across all lists for fields containing dates, countries or nationalities.

We also provide endpoints to import blacklisted countries, CTRP lists (cities, towns, regions, ports) and custom white lists.

How frequently is data updated within Screena?

By default, public watchlists (OSINT) are refreshed into Screena every 3 hours. It is possible to adjust that frequency based on risk appetite.

Commercial off-the-shelf lists are scheduled for update as soon as data is published by vendors.

Do you provide proprietary watchlist data?

On top of the main OSINT and COTS watchlists, we provide proprietary watchlist data for country-based/embargo screening. We natively parse all cities across the globe with a population above 500. We support alternative spellings of cities, countries and nationalities in various languages.

We also provide data integration with the ICIJ’s Offshore Leaks databases.

With regard to COTS, we partner with all the main list providers on the market. We also work with niche players and new market entrants for specific risk typologies (organized crime) or enhanced third-party screening, for instance to ensure compliance with OFAC’s 50 Percent Rule.

Besides name similarity threshold, what screening parameters can I configure to meet my risk appetite?

On top of name similarity scoring, our screening approach allows you to automatically leverage secondary attributes (dates of birth, places of birth, nationalities, addresses, places of registry, flags, BICs, LEIs) for comprehensive named-entity matching. The accuracy of the matching algorithms applied to secondary attributes can be adjusted based on the risk appetite of the firms as well as the completeness and quality of the data available in the lists.

Typically, geo-based algorithms can be configured to match locations within the same region or subregion, while date-matching algorithms can detect whether two dates are within the same month, year, quadrennium or decade. Screening results systematically include all attributes and information needed to understand why a match is returned.
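
As an illustration of configurable date matching, the sketch below grades how close two dates are and accepts or rejects the pair against a chosen tolerance. The names `date_match_level` and `dates_compatible` are hypothetical, and fixed calendar buckets (four-year and ten-year blocks) are an assumption about how "same quadrennium" and "same decade" could be defined; the actual Screena parameters may differ.

```python
from datetime import date

# Ordered from strictest to loosest tolerance.
LEVELS = ("exact", "month", "year", "quadrennium", "decade")


def date_match_level(a: date, b: date) -> str:
    """Return the tightest level at which two dates agree, or 'none'."""
    if a == b:
        return "exact"
    if (a.year, a.month) == (b.year, b.month):
        return "month"
    if a.year == b.year:
        return "year"
    if a.year // 4 == b.year // 4:    # assumption: fixed 4-year buckets
        return "quadrennium"
    if a.year // 10 == b.year // 10:  # assumption: fixed 10-year buckets
        return "decade"
    return "none"


def dates_compatible(a: date, b: date, tolerance: str = "year") -> bool:
    """True when the dates agree at the configured tolerance or tighter."""
    level = date_match_level(a, b)
    return level != "none" and LEVELS.index(level) <= LEVELS.index(tolerance)


customer_dob = date(1980, 5, 1)
listed_dob = date(1980, 11, 23)
print(date_match_level(customer_dob, listed_dob))           # year
print(dates_compatible(customer_dob, listed_dob, "month"))  # False
print(dates_compatible(customer_dob, listed_dob, "year"))   # True
```

A stricter tolerance discards more false hits, while a looser one is safer when list data only carries approximate dates of birth.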

Screena also provides custom libraries to rate and alert on risky locations (e.g., countries, cities or regions) when screening customers and transactions, in either structured or unstructured format. Results are returned in accordance with the risk rating library configured by the customers.

How does Screena handle common names?

We combine multiple techniques to increase matching accuracy for common names.

Secondary-attributes matching is one efficient method for automatically discarding a huge portion of false hits. In addition to secondary-attributes matching, our algorithms allow excluding matches for specific entity types. It is especially helpful to prevent irrelevant matches of common names such as those of individuals or organizations against vessel names (e.g. Christina or Mariana).

When secondary attributes are either missing or unreliable due to data quality issues, we rely on machine learning models that have been extensively trained to overcome the challenges of names very common to specific cultures (e.g. Mohamed Ali, Liu Wei). Furthermore, threshold sensitivity is automatically recalibrated based on the detected culture. This technique delivers better results for cultures whose names are especially challenging to match, such as Chinese, Korean or Vietnamese.

Some machine learning features have been included to apply specific weighting to common name elements of individuals (e.g., Al, Ben) or organizations (e.g., bank, international, services). We also provide a list of stopwords in multiple languages (e.g., and, or) to eliminate false positives when screening narrative fields within payments.
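
The effect of stopword removal and common-word weighting can be sketched with a weighted token overlap. This is a simplified illustration only: the stopword set, the weights and the function name `weighted_token_overlap` are hypothetical values chosen for the example, whereas Screena uses such signals as features inside its machine learning models.

```python
# Hypothetical stopword list and weights, for illustration only.
STOPWORDS = {"and", "or", "of", "the"}
COMMON_TOKEN_WEIGHTS = {
    "bank": 0.2, "international": 0.2, "services": 0.2,  # organizations
    "al": 0.3, "ben": 0.3,                               # individuals
}


def weighted_token_overlap(name_a: str, name_b: str) -> float:
    """Weighted Jaccard overlap: common words contribute less than rare ones."""
    tokens_a = {t for t in name_a.lower().split() if t not in STOPWORDS}
    tokens_b = {t for t in name_b.lower().split() if t not in STOPWORDS}
    union = tokens_a | tokens_b
    if not union:
        return 0.0

    def weight(token: str) -> float:
        return COMMON_TOKEN_WEIGHTS.get(token, 1.0)

    shared = sum(weight(t) for t in tokens_a & tokens_b)
    total = sum(weight(t) for t in union)
    return shared / total


# Two different banks sharing only generic words score low ...
print(round(weighted_token_overlap("Acme International Bank",
                                   "Zenith International Bank"), 2))  # 0.17
# ... while the same name with a stopword and a different order scores 1.0.
print(weighted_token_overlap("Bank of Acme", "Acme Bank"))            # 1.0
```

Without weighting, the first pair would share two of four distinct tokens and look like a plausible match, which is the kind of false positive this feature is meant to suppress.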

Screena comes with a set of options to determine how to handle matches against short names. For instance, it is possible to systematically discard matches against single-token names (i.e. low-quality aliases) contained within full names (e.g. “Arthur Timothy Smith” matching with “Arthur”). These parameters can be differentiated per entity type and tuned based on the number of tokens being screened.
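
A minimal sketch of such a short-name rule follows. The per-entity-type thresholds in `MIN_LIST_TOKENS` and the function name `is_low_quality_alias_hit` are hypothetical; they only illustrate how a single-token alias contained in a longer screened name could be discarded.

```python
# Hypothetical minimum number of tokens a list name needs, per entity type,
# before a "contained in" match is considered meaningful.
MIN_LIST_TOKENS = {"individual": 2, "organization": 2, "vessel": 1}


def is_low_quality_alias_hit(screened_name: str, list_name: str,
                             entity_type: str = "individual") -> bool:
    """True when a too-short list name merely appears inside a longer screened name."""
    screened = screened_name.lower().split()
    listed = list_name.lower().split()
    min_tokens = MIN_LIST_TOKENS.get(entity_type, 1)
    is_short_alias = len(listed) < min_tokens
    is_contained = len(listed) < len(screened) and all(t in screened for t in listed)
    return is_short_alias and is_contained


print(is_low_quality_alias_hit("Arthur Timothy Smith", "Arthur"))        # True: discard
print(is_low_quality_alias_hit("Arthur Timothy Smith", "Arthur Smith"))  # False: keep
print(is_low_quality_alias_hit("Arthur", "Arthur"))                      # False: keep
```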

How does Screena handle inaccurate data?

First of all, let’s clarify what inaccurate data is. Inaccurate data can be caused by either human or machine errors. It takes various forms: manual data-entry mistakes (e.g. swapped name fields), missing data entry controls, unstructured/free format fields, missing or non-standardized information in databases, incompatible formats between data processing systems, etc.

Screena systematically checks the completeness and quality of imported data. For example, Screena ensures dates are always provided in ISO 8601 format. Likewise, countries are always imported in ISO 3166-1 alpha-2 format.

When the original data is not compliant with those standards, Screena tries to resolve it using specific normalization rules and libraries. This normalization process harmonizes and transforms data into a format that makes attribute matching consistent. Normalization libraries are enriched over time with new synonyms or alternative spellings whenever an unknown or incompatible value is provided.
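
The normalization step can be sketched as follows, assuming a small list of accepted input date formats and a tiny synonym library. Both `DATE_FORMATS` and `COUNTRY_SYNONYMS` are hypothetical excerpts created for this example (including the assumption that slash-separated dates are day-first); they are not Screena's actual libraries.

```python
from datetime import datetime
from typing import Optional

# Hypothetical input formats; "14/07/1980" is assumed to be day-first.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y", "%d %B %Y")

# Hypothetical excerpt of a synonym library mapping spellings to ISO 3166-1 alpha-2.
COUNTRY_SYNONYMS = {
    "france": "FR", "république française": "FR",
    "germany": "DE", "deutschland": "DE",
    "united kingdom": "GB", "uk": "GB", "great britain": "GB",
    "united states": "US", "usa": "US",
}
KNOWN_CODES = set(COUNTRY_SYNONYMS.values())


def normalize_date(raw: str) -> Optional[str]:
    """Return the date in ISO 8601 (YYYY-MM-DD), or None if it cannot be resolved."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_country(raw: str) -> Optional[str]:
    """Return an ISO 3166-1 alpha-2 code, or None for an unknown value."""
    value = raw.strip()
    code = COUNTRY_SYNONYMS.get(value.lower())
    if code:
        return code
    if value.upper() in KNOWN_CODES:
        return value.upper()
    return None  # unknown value: a candidate for enriching the synonym library


print(normalize_date("14/07/1980"))      # 1980-07-14
print(normalize_country("Deutschland"))  # DE
print(normalize_country("UK"))           # GB
print(normalize_country("Atlantis"))     # None
```

Values that come back as `None` are the ones that would feed the enrichment of the normalization libraries described above.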

Screena’s rules-based algorithms tackle specific data quality issues such as typos, truncated names, out-of-order name elements, and split or concatenated name elements.

Screena’s data model also provides distinct fields to differentiate structured and unstructured information (e.g. parsed names vs. full names, structured addresses vs. free format addresses).

Distinct algorithm parameters are available to handle data quality nuances. For example, it is possible to specify how a match should be handled when one attribute associated with an algorithm is either empty or not provided. In other instances, inaccurate dates can be matched within the same year or decade. Similarly, addresses can be matched within the same region or subregion based on the United Nations geoscheme.
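
The two parameters mentioned above (handling of missing attributes, and region or subregion tolerance) can be sketched like this. The mapping `UN_GEOSCHEME` is a tiny excerpt of the United Nations geoscheme, and the function `locations_compatible` with its `tolerance` and `match_if_missing` parameters is a hypothetical illustration, not the Screena API.

```python
from typing import Optional

# Tiny excerpt of the UN geoscheme: ISO 3166-1 alpha-2 code -> (region, subregion).
UN_GEOSCHEME = {
    "FR": ("Europe", "Western Europe"),
    "DE": ("Europe", "Western Europe"),
    "IT": ("Europe", "Southern Europe"),
    "ES": ("Europe", "Southern Europe"),
    "JP": ("Asia", "Eastern Asia"),
    "CN": ("Asia", "Eastern Asia"),
    "VN": ("Asia", "South-eastern Asia"),
}
TOLERANCES = ("country", "subregion", "region")


def locations_compatible(a: Optional[str], b: Optional[str],
                         tolerance: str = "subregion",
                         match_if_missing: bool = True) -> bool:
    """Compare two country codes at the configured geographic tolerance."""
    if tolerance not in TOLERANCES:
        raise ValueError(f"unknown tolerance: {tolerance}")
    if not a or not b:
        return match_if_missing  # policy for empty or missing attributes
    a, b = a.upper(), b.upper()
    if a == b:
        return True
    if tolerance == "country":
        return False
    geo_a, geo_b = UN_GEOSCHEME.get(a), UN_GEOSCHEME.get(b)
    if geo_a is None or geo_b is None:
        return match_if_missing  # assumption: unknown codes are treated like missing values
    if tolerance == "subregion":
        return geo_a == geo_b    # same region and same subregion
    return geo_a[0] == geo_b[0]  # same region


print(locations_compatible("FR", "DE", "subregion"))           # True
print(locations_compatible("FR", "IT", "subregion"))           # False
print(locations_compatible("FR", "IT", "region"))              # True
print(locations_compatible(None, "FR", match_if_missing=False))  # False
```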

To achieve greater precision when screening free format fields, Screena applies advanced text analytics techniques to detect distinct objects (named entities vs. addresses) within the same field and thus prevent irrelevant matches.

When it comes to matching names, if no valid culture can be determined with high certainty, Screena falls back on generic machine learning models trained on broader, more comprehensive datasets.

What methods does Screena use to reduce false positives?

Besides traditional techniques such as whitelisting, the reduction of false positives is achieved with three combined methods:

  1. Secondary-attributes matching to automatically discard hits where names match but other entity attributes are incompatible. These other attributes include entity types, sexes, BICs, LEIs, dates of birth, dates of registry, dates of build, places of birth, places of registry, places of build, addresses, nationalities, flags, etc.
  2. Machine learning prediction scoring based on 150+ name matching features to go well beyond the natural limitations of traditional fuzzy algorithms and increase scoring precision by analyzing numerous name characteristics together.
  3. Delta screening to avoid periodically regenerating the same results. When using delta screening, once all source records have been screened against all watchlist records, only deltas are considered (i.e. any new record or any update on at least one of the fields used for matching is deemed a delta).
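
The delta screening principle in the third method can be sketched by fingerprinting only the fields used for matching. The field list `MATCHING_FIELDS`, the record layout and the helper names below are hypothetical; the sketch only shows why a change to a non-matching field does not trigger a new screening.

```python
import hashlib
import json

# Hypothetical set of fields that take part in matching.
MATCHING_FIELDS = ("name", "date_of_birth", "nationality")


def fingerprint(record: dict) -> str:
    """Stable hash of the matching fields only."""
    payload = {f: record.get(f) for f in MATCHING_FIELDS}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def select_deltas(records: list, seen: dict) -> list:
    """Return new or changed records and remember their fingerprints in `seen`."""
    deltas = []
    for record in records:
        fp = fingerprint(record)
        if seen.get(record["id"]) != fp:
            deltas.append(record)
            seen[record["id"]] = fp
    return deltas


seen = {}
customers = [{"id": "c1", "name": "John Smith", "date_of_birth": "1980-07-14",
              "nationality": "GB", "comment": "onboarded"}]
print(len(select_deltas(customers, seen)))  # 1: first run screens everything
print(len(select_deltas(customers, seen)))  # 0: nothing changed
customers[0]["comment"] = "reviewed"        # not a matching field
print(len(select_deltas(customers, seen)))  # 0: still no delta
customers[0]["name"] = "Jon Smith"          # matching field updated
print(len(select_deltas(customers, seen)))  # 1: record is a delta again
```

The same logic applies symmetrically to watchlist records, so that only new or updated entries on either side are compared in periodic runs.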

What is the percentage of false positives reduced with the methods used by Screena?

Many screening vendors claim they deliver 90% fewer false positives; others claim 70% or 95%. Such figures are rarely put in context. How many data points were used? Was the data of good quality? How many records were screened in total? Was any impact on false negatives measured? What method was used to distinguish true positives from false positives? Were the results consistently reviewed under a four-eyes control principle?

In other words, without any contextual element, these figures have no scientific value. It also raises the question: if almost every vendor is able to reduce false positives in dramatic proportions, then why does screening remain such a critical problem for so many organizations around the world today?

Therefore, we at Screena are very cautious and humble about communicating any percentage of false positives reduced. In our view, even single-digit incremental gains in screening quality can have a huge impact on large-scale operations. And that’s precisely what our clients observe using our technology.

Speed and Real-Time Capabilities

Does Screena provide real-time screening capabilities?

Yes, Screena is a RESTful API built for real-time integration. We also provide endpoints for asynchronous batch processing.

What is the speed of screening?

Screena’s cloud-native API is built to scale using the elastic capabilities of cloud computing. Our basic cloud infrastructure is sized to process at least 500 transactions per second against global watchlists. For customers that require more capacity, we can scale on demand and process 10,000 searches per second. For premium customers managing millions of name records, we can upsize our cloud infrastructure to reach close to 100,000 searches per second for use cases that require such levels of performance.

Having said that, before agreeing to specific service levels, we work with our clients to put performance requirements in context – including data provisioning, business operations and financial costs.
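
To put the throughput figures above in context, here is a back-of-the-envelope calculation of how long a batch would take at each capacity level. It assumes one search per record and a sustained rate, and it ignores network latency and data provisioning time; the batch size of one million records is an arbitrary example.

```python
def batch_duration_seconds(record_count: int, searches_per_second: float) -> float:
    """Idealized duration of a batch at a sustained search rate."""
    if searches_per_second <= 0:
        raise ValueError("searches_per_second must be positive")
    return record_count / searches_per_second


for rate in (500, 10_000, 100_000):
    seconds = batch_duration_seconds(1_000_000, rate)
    print(f"{rate:>7} searches/s -> {seconds:,.0f} s")
# 500 searches/s takes 2,000 s (about 33 minutes); 100,000 searches/s takes 10 s.
```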

Pricing

What is your pricing model?

Our pricing model is volume-based. Unit price decreases with higher volume levels.

What additional modules are not included in your core product pricing package?

Our core product pricing package is API-first. We provide additional UI modules for manual searches and alert remediation investigations.

Do you charge for professional services and/or after-sales services?

We only charge for professional services covering tailored product support and custom hosting requirements.

Do you offer bespoke solutions to clients?

Only for custom hosting requirements and machine learning recalibration based on specific data typologies.

Upload of private lists with full data encryption is included in our standard pricing package.

New high-value product requirements based on clients’ feedback are added to our product roadmap.

Solution Deployment

How can Screena be deployed?

As a standard, we host multiple clients on a single standalone environment located in France on AWS.

On demand, we can deploy Screena in all regions available from Amazon Cloud. They include the US, Africa, Asia, Canada, Europe, the Middle East, and South America. We can also deploy a private dedicated environment for a client or deploy on premises.

The system is stateless and based on multiple servers and multiple regions with a replication of the database. We also have another provider (OVH) on standby which can be used in case of failure with Amazon.

Can clients configure the solution themselves or would they need support from you?

It is possible for our clients to configure the solution entirely by themselves. As we are developer-first, we provide comprehensive API documentation with examples of requests/responses for all endpoints (https://developer.screena.ai/). We also share detailed technical guides as well as Postman collections to help our clients get started and deep dive into specific use cases. We intervene above all to ensure a smooth onboarding and a great development experience.

As we acknowledge that there is nothing like a “one-size-fits-all” solution in the field of watchlist screening, we can provide tailored support – typically during the UAT phase – to optimize Screena’s large range of option-based capabilities and adapt the API configuration to our client’s risk appetite.

Do you partner with any other solutions/vendors?

Yes, we partner with many world-class vendors, as the Screena API is designed to work within an open and best-of-breed enterprise technology ecosystem.

We typically partner with two main categories of vendors:

  1. Data vendors: we integrate their data feeds into Screena API and we handle the entire screening process.
  2. Transaction monitoring and fraud detection software vendors: Screena is integrated within financial crime prevention platforms as a best-of-breed screening service to offer a holistic view of customers and transactional risks.

Do you provide technology or solutions to support alert remediation investigation?

Secondary-attributes matching (e.g. date of birth matching, address matching) empowers users to automatically distinguish a true match from a false match on an alert primarily raised on matching names.

Furthermore, when a match is found, the response returned by Screena also contains all additional information and evidence (e.g. links to original documents) published in the lists, so that an alert can be investigated with meaningful context and with the underlying risk profile of the hit.

Can Screena be integrated within third-party platforms and systems?

Yes, the Screena API is designed to integrate with third-party systems.

One option is to integrate Screena as a microservice on top of an existing watchlist screening solution to help reduce false positives. This “AI-as-a-Service” integration scenario is designed to augment legacy screening systems that have no core machine learning capabilities.

Another typical integration scenario consists of plugging the Screena API, as a best-of-breed screening service, into a financial crime investigation platform. For instance, we work with leading transaction monitoring or investigative case management vendors that wish to quickly and easily extend their product offering to watchlist screening without the burden of in-house developments.

In both cases, Screena works as a pure backend service to either supercharge or complement third-party platforms and systems.

Security and Audit

What security measures do you apply?

We host multiple clients on a single standalone environment hosted on AWS through a single RESTful API. AWS employs a robust security program with multiple certifications including ISO 27001, ISO 27017, ISO 27018, ISO 27701, SOC 1, SOC 2, SOC 3, PCI DSS (https://aws.amazon.com/compliance/programs/).

The system that hosts Screena runs on ARM-based Linux. All disks are encrypted. We use Amazon API Gateway and Amazon Load Balancer with WAF to distribute requests to the system. No administration API is exposed externally: administration APIs are only accessible from the internal network. The production operating systems receive security updates daily.

All the code produced for the core application and associated services adheres to the OWASP guidelines and recommendations to prevent common security issues such as cross-site scripting (XSS) or SQL injections. Every code change is committed, signed, and tracked in a versioning system.

During the development phase of the application, an automated security audit is run using GitLab and Sonar tools, and its results are reviewed before each release.

We have a full weekly backup and an hourly snapshot of the environment.

Multiple logging systems are in place to detect unauthorized access to the system. We use the standard logging from Amazon CloudTrail and CloudWatch, as well as logging within the application itself, where each API request is logged.

We don’t store or keep customers’ personal data sent through Screena search endpoint (POST rest/v2/dataset-search-engine). We only log and count the number of API requests executed monthly for billing purposes.

Data is kept confidential at all times, as we encrypt it in transit using AES-256 with SHA-512.

To use the Screena API, an API key is mandatory. We provide one API key, unique to each client, on a one-off basis. The API key is a unique encrypted identifier used to authenticate requests for usage and billing purposes. Since the API key identifies the application or the user, it is unique, random and non-guessable. Every API request must be associated with an API key.
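
As a rough sketch of what an authenticated call could look like, the code below builds (without sending) a POST request to the search endpoint mentioned above. Only the path `rest/v2/dataset-search-engine` comes from this page. The base URL, the `x-api-key` header name and the payload fields are assumptions made for illustration; the official API documentation is the reference for the real values.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"            # placeholder, not the real host
SEARCH_PATH = "/rest/v2/dataset-search-engine"  # path quoted in this FAQ
API_KEY_HEADER = "x-api-key"                    # assumed header name


def build_search_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare a JSON POST request carrying the API key; nothing is sent here."""
    if not api_key:
        raise ValueError("an API key is required for every request")
    return urllib.request.Request(
        url=BASE_URL + SEARCH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


# Hypothetical payload shape, for illustration only.
request = build_search_request("my-secret-key", {"name": "Arthur Timothy Smith"})
print(request.get_method(), request.full_url)
```

In a real integration the key would be loaded from a secret store or an environment variable rather than written in source code, and the prepared request would be sent over HTTPS.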

Does Screena provide versioning capabilities?

Yes, every version of a watchlist record is retained and assigned a numeric revision value. In addition to the revision value, every record is tagged with its creation date/time and last update date/time. Consequently, you have access to any event affecting a record, including:

  • the current revision (i.e. most recent published version),
  • any corrections or enhancements,
  • and any other changes made to the record in question.

The first version of a record is always numbered as revision number 1.

We also keep in memory the date/time when a record was last matched to manage deltas as needed for periodic screening.
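
The revision model described above can be pictured with a small data structure. The class and field names below (`WatchlistRecord`, `Revision`, `last_matched_at`) are hypothetical and only mirror the concepts in the text: revisions numbered from 1, creation and update timestamps, and a last-matched timestamp used to decide whether a record is a delta.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Revision:
    number: int           # 1 for the first version of a record
    data: dict
    updated_at: datetime


@dataclass
class WatchlistRecord:
    record_id: str
    created_at: datetime
    revisions: List[Revision] = field(default_factory=list)
    last_matched_at: Optional[datetime] = None

    def publish(self, data: dict, when: datetime) -> Revision:
        """Store a new version; earlier revisions are kept untouched."""
        revision = Revision(len(self.revisions) + 1, dict(data), when)
        self.revisions.append(revision)
        return revision

    @property
    def current(self) -> Revision:
        """Most recently published revision."""
        return self.revisions[-1]

    def needs_rescreening(self) -> bool:
        """A record is a delta if it changed after it was last matched."""
        return (self.last_matched_at is None
                or self.current.updated_at > self.last_matched_at)


record = WatchlistRecord("r1", created_at=datetime(2023, 1, 1))
record.publish({"name": "Jon Smith"}, datetime(2023, 1, 1))
record.last_matched_at = datetime(2023, 1, 2)
record.publish({"name": "John Smith"}, datetime(2023, 2, 1))  # correction
print(record.current.number, record.needs_rescreening())      # 2 True
```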

 
