Sarasota, FL, March 18, 2015 / By: Justin McDonald, The Fraud Practice LLC
Custom modeling and analytics is an advanced risk management technique that utilizes organization-specific data to identify trends and evaluate the risk of future transactions by use of statistical formulas or models. Advancements in data science, machine learning and technology have made custom modeling solutions more affordable and attainable, and organizations have benefitted from the increased availability of such services in the marketplace. This is particularly true for merchants and other mid-sized organizations that may not have the resources to build and manage custom modeling and analytics entirely in-house but now have more options for buying partial to complete modeling solutions.
Whether an organization is building a custom modeling solution in-house, using a service provider or combining both in-house and third party resources, the fundamental components of an effective custom modeling solution are the same. Statistical models must first be created, which requires historical data, a team of modeling experts, as well as the right tools and software to design effective models. Next, the organization will need the infrastructure or platform to actually apply this model to live transactions, interpret the results and route the transactions accordingly. A commonly observed problem in the market, however, is that organizations put forth such great effort in ensuring the statistical models are accurate predictors of risk that the next step, how these models are actually deployed, is often overlooked or just an afterthought.
This isn’t to say that model design is not a critical step. What good is efficiently deploying custom models if they are not effective at distinguishing fraudulent from legitimate transactions? But organizations must also consider the other side of the coin: even if a custom model were accurate at predicting fraud nearly all of the time, it is of no benefit unless it can be applied to transactions, meaning the transactional and customer data can be fed to the models and the results can be interpreted to decide the course of action for each order.
Deployment is the second major step in executing custom models, after model design, but it is at least as important.
This statement is true for in-house, outsourced and hybrid custom modeling solutions, but this article focuses on assessing deployment features and capabilities when shopping vendors. Organizations that have or plan to have in-house custom modeling and deployment should prioritize the following capabilities and considerations and build accordingly.
First let’s be clear on what is meant by model deployment. At this stage the custom statistical models already exist, but deployment refers to how these models are actually leveraged. It may be best to provide a simplified example. First, think of a statistical model as a formula. The formula takes into account many variables, often hundreds or thousands, and applies coefficients, or weights, to each variable. Deciding which variables to include and what weights to apply are examples of how models are designed.
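The "model as a formula" idea can be sketched in a few lines. This is only an illustration: it assumes a logistic-regression-style model (the article does not name a model type), and the three variables, their weights and the intercept are made up for the example.

```python
import math

# Hypothetical weights for illustration only; a real model may use
# hundreds or thousands of variables with weights fit to historical data.
WEIGHTS = {
    "address_mismatch": 1.2,   # 1 if billing and shipping differ, else 0
    "card_on_blacklist": 3.0,  # 1 if the card number is blacklisted, else 0
    "cards_per_email": 0.4,    # distinct cards seen with this email address
}
INTERCEPT = -4.0  # assumed baseline term


def fraud_score(variables: dict) -> float:
    """Weighted sum of the variables, squashed to a score between 0 and 1."""
    z = INTERCEPT + sum(WEIGHTS[name] * variables[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


low = fraud_score({"address_mismatch": 0, "card_on_blacklist": 0, "cards_per_email": 1})
high = fraud_score({"address_mismatch": 1, "card_on_blacklist": 1, "cards_per_email": 5})
print(round(low, 3), round(high, 3))
```

Choosing the entries in `WEIGHTS` is the design step; everything needed to call `fraud_score` on a live order is the deployment step.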
To deploy a custom model is to run it on a platform that can produce or calculate all of the variables the model needs. After feeding the model the data it needs to produce a predictive outcome or score, the platform must execute a decision (Approve/Decline/Review) contingent on the result. In short, deployment refers to the infrastructure and processes required to apply custom models to live transactions and subsequently route these transactions accordingly. Below are seven important considerations around model deployment that all organizations currently using, planning to use or considering custom modeling should keep in mind.
Feeding the Model
The deployment platform can be thought of as a hub that is connected to multiple databases, third party services and data sources providing all the needed information or variables that feed the models. On a more basic level variables can be binary, such as whether or not shipping and billing addresses match or if the payment card number being used is on a blacklist. The variables that a model relies on can also be more complex. For example, a model may call for variables such as the distance between billing and shipping addresses or a velocity of change count like the number of different payment cards associated with the same email address. The model must be provided the distance in miles or kilometers, or the velocity count number; it is not expected to make the prerequisite calculations. Often models rely on very complex variables that may be correlated or contain sophisticated aggregations with respect to many data points. The responsibility of providing these complex variables falls on the deployment platform, either to make these calculations or pull this information from another source and feed it to the model.
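To make the two example variables concrete, here is a minimal sketch of how a platform might compute them before calling a model. It assumes the addresses have already been geocoded to latitude and longitude by some other service, and the `CardVelocity` class and its in-memory storage are hypothetical stand-ins for whatever data store a real platform would use.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class CardVelocity:
    """Tracks how many distinct payment cards have been seen per email."""

    def __init__(self):
        self._cards_by_email = defaultdict(set)

    def record(self, email, card_token):
        self._cards_by_email[email.lower()].add(card_token)

    def count(self, email):
        # .get avoids creating an empty entry for emails never seen before
        return len(self._cards_by_email.get(email.lower(), ()))


velocity = CardVelocity()
velocity.record("buyer@example.com", "card-1")
velocity.record("Buyer@example.com", "card-2")
velocity.record("buyer@example.com", "card-1")  # repeat card, not counted twice

variables = {
    "bill_ship_distance_km": distance_km(27.34, -82.53, 40.71, -74.01),
    "cards_per_email": velocity.count("buyer@example.com"),
}
```

The model only ever sees the finished numbers in `variables`; the geocoding, the distance calculation and the running card count all happen on the platform side.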
In other words, the platform provides a means for accessing other fraud prevention techniques or services and is responsible for feeding the models all of the required data. This includes third party tools or services that may be in use, such as device identification, as well as techniques that can be applied in-house, like velocity checks.
When building or buying the platform or deployment component of a custom modeling solution, one of the first considerations should be what data is needed to feed the models and how the platform is going to provide it. This is important to consider not only in the context of what is needed for current model designs, but also in terms of what data or capabilities may be desired in the future to feed and improve models. Any limitations of the platform to provide data in the needed format or with needed calculations can impact the ability to adjust and improve models going forward. Additionally, if adding new risk management techniques or third party services to a risk management strategy in the future, organizations will want to be sure the platform can interact with and pull data to feed models from these new sources as well.
Is the Platform Able to Support Batch and/or Real-Time Processing?
An important consideration related to feeding custom models is how quickly this data needs to be provided. Custom modeling solutions can be delivered in real-time or via batch processing, and the deployment platform needs to support the desired method. When custom models run in real-time the outcome or risk score for each transaction can be provided almost instantaneously once submitted, as opposed to holding and submitting batches of transactions to run through the custom models at set intervals. To support real-time custom modeling, the platform needs to be able to provide all the needed variables and calculations in real-time as well. When thinking about what capabilities are needed for the platform to successfully feed models with all required data elements, it is also important to consider how quickly this must be provided to support a real-time custom modeling solution.
How Quickly Can Models Be Deployed?
It’s not only important to consider how quickly a custom modeling solution can assess each transaction as it comes in, but also how quickly new or refreshed models can be deployed. The platform not only needs to feed models and execute procedures based on model results, it also needs to manage when models are activated, inactivated or replaced. Often organizations overlook this important aspect of deployment.
The ability to adapt is an important aspect of an effective risk management strategy, and this includes the ability to implement changes quickly. When shopping vendor custom modeling platforms and deployment options, consider how often and how quickly changes to models can be made.
Consider how long it takes from when a new or updated model is submitted to when live transactions are being run against that model. Are new or updated models deployed in the middle of the night when volume is low, or does the deployment process begin immediately once the model is submitted? When urgent changes need to be made to a model it is important for organizations to be able to apply these changes quickly so that losses and exposure time can be minimized. Whether built in-house or provided by a third party, organizations should have access to an integrated platform that can swiftly and efficiently enable model deployment.
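One way to picture the activate/replace responsibility is a small registry that holds submitted model versions and records when each one goes live. The `ModelRegistry` class below is a hypothetical sketch, not any vendor's API; it treats a model as any callable that takes the variables and returns a score.

```python
from datetime import datetime, timezone


class ModelRegistry:
    """Keeps submitted model versions and tracks which one is live."""

    def __init__(self):
        self._versions = {}  # version label -> scoring callable
        self._active = None  # label of the version scoring live traffic
        self.history = []    # (timestamp, label) activation log

    def submit(self, label, model):
        self._versions[label] = model

    def activate(self, label):
        if label not in self._versions:
            raise KeyError(f"unknown model version: {label}")
        self._active = label
        self.history.append((datetime.now(timezone.utc), label))

    def score(self, variables):
        if self._active is None:
            raise RuntimeError("no model is active")
        return self._versions[self._active](variables)
```

The gap between a `submit` call and the matching entry in `history` is the time-to-deploy discussed above, and re-activating an earlier label is a simple form of rollback.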
“Organizations must make decisions rapidly and apply them continuously to maintain risk levels under control. We see fast design of custom models and their immediate deployment as critical components of an effective fraud detection solution. Access to a fully integrated platform for the development of multiple model packages and streamlined deployment of the models into execution is paramount. The ability to support complex models as well as deploy them immediately maintains smooth and effective fraud prevention while enabling a frictionless consumer experience.”
Cristina Soviany, CEO, Features Analytics
Hosting the Platform
Whether building or buying a platform, an organization must also consider how it would like to access it. There are three primary ways an organization can access its modeling platform: the platform is hosted on local servers, the platform is cloud hosted, or the platform is accessed via an API. Locally hosted platforms are most common for homegrown platforms while API access and cloud hosted platforms are common options when buying access to a platform, although several third party providers offer local hosting as well.
Organizations must decide which hosting method is best for their wants and needs. The benefit of cloud hosting and API access is that these methods are typically quicker to get up and running and carry little ongoing administrative cost. There is overhead with hosting and maintaining local servers, but an advantage is that data can be processed more quickly compared to a cloud hosted platform where data must be encrypted and transmitted. If a vendor offers both local and cloud hosted platform access, dig deeper and see if there are any differences in the service level agreements (SLAs) between the two implementation options. Ensure that the desired hosting option can also support a real-time solution if this is a priority.
Can the Platform Support Your Volume?
Considering hosting options and SLAs also brings up another important factor: the platform’s ability to support an organization’s volume. Many commercial solutions support even the highest volume merchants, but it is important to ensure this before committing to any vendor or implementation option. Consider not only current volume, but how the custom modeling solution and platform will scale as volume grows. As a general best practice organizations should ensure that a modeling platform provider can handle their peak volume and that vendors are contractually obligated to meet promised SLAs. These SLAs should cover the number of transactions that can be processed per second (TPS) and the response time per transaction.
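A quick back-of-the-envelope check can help when comparing an SLA against expected volume. The sketch below uses Little's law (average requests in flight equals arrival rate times time in the system) and a simple compound growth projection; the function names and the sample figures are assumptions for illustration, not benchmarks.

```python
def in_flight_transactions(peak_tps: float, response_time_ms: float) -> float:
    """Little's law: average concurrent requests = arrival rate x time in system."""
    return peak_tps * (response_time_ms / 1000.0)


def meets_sla(peak_tps: float, sla_tps: float, growth_rate: float = 0.0, years: int = 0) -> bool:
    """True if the contracted TPS covers the projected peak volume."""
    projected = peak_tps * (1 + growth_rate) ** years
    return projected <= sla_tps


# Hypothetical merchant: 200 TPS at peak, 250 ms response time, 500 TPS contracted.
print(in_flight_transactions(200, 250))
print(meets_sla(200, 500))
print(meets_sla(200, 500, growth_rate=0.5, years=3))
```

In this made-up case the contract covers today's peak but not three years of 50 percent annual growth, which is the kind of scaling question worth raising before signing.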
Can the Platform Support Multiple Models?
Another consideration with deployment is the platform’s ability to run multiple models or segmented models. More sophisticated custom modeling solutions often rely on a series of models that are applied to different populations or sets of transactions based on order or customer characteristics. A common example is regional models, where domestic orders run through one model and a separate model is applied to international orders. It is also fairly common to have different models for new versus returning customers that have completed good orders in the past.
Organizations that make use of multiple models must make sure their platform can support them. To successfully deploy a custom modeling solution with segmented models, the platform must first determine which model or set of models each transaction or customer is routed to. The platform must be able to support the logic that detects the characteristics dictating which set of models is applied to each transaction.
To support multiple models the platform must have this transaction routing capability. Often multiple models can be nested, such that the outcome of the first model applied dictates which model is applied next. This is an iterated form of transaction routing capabilities, and if these are features of the model design then the platform must be able to support such features.
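The routing and nesting described above might look like the following sketch. The segment names, the order fields (`ship_country`, `merchant_country`, `prior_good_orders`) and the escalation threshold are all hypothetical; models are again treated as plain callables.

```python
def route(order: dict) -> str:
    """Pick a model segment from order and customer characteristics."""
    domestic = order["ship_country"] == order["merchant_country"]
    region = "domestic" if domestic else "international"
    customer = "returning" if order["prior_good_orders"] > 0 else "new"
    return f"{region}_{customer}"


def score_order(order, first_stage, second_stage, escalate_at=0.5):
    """Nested models: the first-stage score decides whether a second model runs."""
    segment = route(order)
    score = first_stage[segment](order)
    if score >= escalate_at and segment in second_stage:
        score = second_stage[segment](order)
    return segment, score
```

Here `first_stage` and `second_stage` are dictionaries mapping a segment name to its model, so adding a new segment means adding a routing rule and a dictionary entry rather than changing the scoring code.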
Does the Platform Support A/B or Shadow Testing?
Another feature that custom modeling solutions may support is the ability of the platform to perform A/B testing on live transactions, or use “shadow testing” to see how a model is performing without it impacting how a transaction is handled. These are great features for seeing which models perform better and ensuring models are accurately detecting fraud before fully deploying them. While great for refining model design, these are somewhat more advanced features with regard to deployment.
Supporting A/B testing is similar to supporting segmented models in that different models are applied to different transactions. The difference here is that A/B assignment should be random rather than based on transaction characteristics. For a platform that supports multiple models, adding A/B testing is an extra step. To provide an example, take a merchant that is testing two versions of a model that will be applied to first time domestic customers. First the platform identifies which model type will be applied to the order, but now that the organization is A/B testing two versions, the platform will need to randomly apply model version A to half of these transactions and model version B to the other half.
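A minimal sketch of the random split, using Python's standard `random` module. The `assign_version` helper is hypothetical, and the fixed seed is there only to make the example repeatable; a live platform would not seed this way and would store the assigned version alongside each transaction so outcomes can later be compared per version.

```python
import random


def assign_version(rng: random.Random, split: float = 0.5) -> str:
    """Randomly pick model version 'A' or 'B'; `split` is the share sent to A."""
    return "A" if rng.random() < split else "B"


rng = random.Random(42)  # seeded only so the example is repeatable
counts = {"A": 0, "B": 0}
for _ in range(10_000):
    counts[assign_version(rng)] += 1
print(counts)
```

Because assignment is random, the split is only approximately even on any given sample, which is expected and does not bias the comparison.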
With shadow testing the platform is effectively running two models simultaneously on a transaction. The existing model, already in production, is applied to the transaction and dictates how the transaction is routed. The model being tested is also run against the transaction, but its score or result is only stored in a separate database with the unique transaction identifier and does not impact the order. This score is compared with that of the original model, as well as with the actual outcome for the transaction after a few months, when this can be confirmed.
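In code, the shadow pattern comes down to scoring twice and deciding once. The sketch below is illustrative only: a plain dictionary stands in for the separate database, the decision is simplified to two outcomes with an assumed threshold, and `process` is a hypothetical name.

```python
def process(txn_id, variables, production_model, shadow_model, shadow_log, decline_at=0.8):
    """Score with both models; only the production score drives the decision."""
    prod_score = production_model(variables)
    try:
        shadow_score = shadow_model(variables)
    except Exception:
        # A failing shadow model must never affect the live order.
        shadow_score = None
    # Stored by transaction identifier for later comparison with the real outcome.
    shadow_log[txn_id] = {"production": prod_score, "shadow": shadow_score}
    return "Decline" if prod_score >= decline_at else "Approve"
```

Catching errors from the shadow model is a design choice in this sketch: a candidate model that crashes should show up in the comparison data, not in the customer's checkout.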
From a model design and maintenance perspective, A/B and shadow testing are great features to have, but they must be supported by the platform. For organizations that build and maintain custom models in-house, if the statisticians and data scientists deem these features important, then the organization will want to be sure the platform can facilitate them.
Conclusions
Organizations have more options today when it comes to custom modeling, as the market now offers partial to complete solutions. Whether a merchant designs its own custom models and uses a third party platform for deployment, outsources model design and maintenance but deploys on a homegrown platform, builds both or buys both, there are many considerations when it comes to model deployment.
Limitations to the deployment platform can restrict the variables and data that models leverage to detect fraud, reducing the model's ability to accurately estimate the likelihood of fraud. When assessing a modeling platform, considerations range from the platform’s ability to support peak volume and how and where the platform will be hosted, to its ability to support multiple models, model testing and quick deployment of model changes.
Limitations to the modeling platform translate to limitations in the custom modeling solution overall, and it can be frustrating to have features of a well-designed model that cannot be used due to a platform that cannot fully support it. This is why the modeling platform and deployment are just as important as the design and accuracy of the model, if not more so.