Deconstructing the Data Dilemma in Banks: Banks have a false sense of having data

06 October 2023 1 comment

Dmitriy Wolkenstein

CEO

TIMVERO

Hi all,

It's Dmitriy from TIMVERO.

Today we are starting a series of articles about data usage, AI/ML modeling, and analytics in banking.

The first one is about issues with data availability. So, let’s start.

In today’s banking sector, institutions often operate under a false sense of data availability. They may believe they possess ample data for loan portfolio analysis, decision-making, and machine learning (ML) models when, in reality, a significant portion of this data is inaccessible or stored in a format their analytical systems cannot use. This common misconception is a critical roadblock to better data-driven strategies and predictive analytics.

Problem: The Illusion of Abundant Data

When banks receive data into their internal production systems, it typically arrives in a ‘raw’ format such as XML or JSON. These formats are hierarchical rather than tabular and are therefore not immediately usable for most analytical or ML use cases, because the vast majority of ML models and risk analysis tools require tabular input. Furthermore, the data must be available consistently over time. A system that retains a given field for only the most recent month and lacks it for historical periods cannot supply the history these models need.
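As a rough illustration of the gap between a raw response and a tabular row, here is a minimal sketch in Python. The payload, its field names (`applicant_id`, `bureau`, `accounts`), and the `flatten` helper are hypothetical and invented for this example; they do not describe any particular vendor's format.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict of column name -> scalar."""
    row = {}
    if isinstance(value, dict):
        for key, child in value.items():
            row.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            row.update(flatten(child, f"{prefix}{index}."))
    else:
        # Leaf value: drop the trailing dot from the accumulated prefix.
        row[prefix[:-1]] = value
    return row


# Hypothetical raw response, as it might arrive from an external data source.
raw_response = json.loads("""
{
  "applicant_id": "A-1001",
  "bureau": {"score": 712, "inquiries_6m": 2},
  "accounts": [
    {"type": "card", "balance": 1500.0},
    {"type": "auto", "balance": 9200.0}
  ]
}
""")

row = flatten(raw_response)
for column, value in row.items():
    print(column, value)
```

Each key of `row` becomes a column name such as `bureau.score` or `accounts.1.balance`. Real responses usually need more care, for example aggregating variable-length lists instead of indexing them, so this is only a starting point.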

Unfortunately, the data that enters production systems is often not saved in full: only the parts used by the production systems are retained, while the rest may be discarded.

This situation can arise from miscommunication between IT infrastructure teams and data scientists or risk analysts. In more forgiving circumstances, part of each response is parsed and stored in a format usable for later analysis, while the remainder is left in raw form on a server where the data science team often cannot reach it.
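One way to avoid losing the unused part of a response is to archive the full payload at the moment it arrives, so that new features can be computed later for historical periods. The sketch below assumes a hypothetical `raw_responses` table and uses an in-memory SQLite database purely for illustration; a real deployment would use whatever storage the bank already operates.

```python
import json
import sqlite3

# In-memory database for illustration only; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_responses (applicant_id TEXT, received_at TEXT, payload TEXT)"
)


def archive_response(applicant_id, received_at, payload):
    """Store the complete response, not only the fields production uses today."""
    conn.execute(
        "INSERT INTO raw_responses VALUES (?, ?, ?)",
        (applicant_id, received_at, json.dumps(payload)),
    )


def backfill_feature(extract):
    """Compute a newly defined feature over every archived response."""
    cursor = conn.execute(
        "SELECT applicant_id, received_at, payload "
        "FROM raw_responses ORDER BY received_at"
    )
    return [(a, r, extract(json.loads(p))) for a, r, p in cursor]


archive_response("A-1001", "2023-01-15", {"bureau": {"score": 712, "inquiries_6m": 2}})
archive_response("A-1002", "2023-06-03", {"bureau": {"score": 655, "inquiries_6m": 5}})

# Months later, an analyst defines a feature that production never stored.
history = backfill_feature(lambda p: p["bureau"]["inquiries_6m"])
print(history)
```

Because the whole payload was kept, the new feature is available for every past application rather than only from the day someone thought to parse it.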

This mismatch between data formats and data availability limits banks’ ability to improve models without expending significant time and effort to retrieve additional data.

Solution: Harnessing Integrated Data Transformation Tools

To combat this issue, banks can use integrated data transformation tools to convert incoming data into a more usable format. These tools should be easily accessible to ML and risk teams, thereby lowering the cost of creating new features. Ideally, they should be separated from underwriting tools so that the bank can store more features than it uses in production. This separation makes it possible to feed additional data into models to search for new patterns, making it easier to adapt and evolve models as needed.
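The idea of storing more features than production uses can be sketched as a feature registry that is kept apart from the underwriting logic. Everything here is an assumption made for illustration: the feature names, the `FEATURE_DEFINITIONS` registry, and the `PRODUCTION_FEATURES` subset are hypothetical and do not represent any specific product.

```python
from typing import Callable, Dict

# Registry owned by ML and risk teams: every feature they may want to study.
FEATURE_DEFINITIONS: Dict[str, Callable[[dict], float]] = {
    "bureau_score": lambda r: r["bureau"]["score"],
    "inquiries_6m": lambda r: r["bureau"]["inquiries_6m"],
    "total_balance": lambda r: sum(a["balance"] for a in r["accounts"]),
    "open_accounts": lambda r: len(r["accounts"]),
}

# Subset that the current underwriting model actually consumes.
PRODUCTION_FEATURES = {"bureau_score", "inquiries_6m"}


def compute_all_features(raw: dict) -> Dict[str, float]:
    """Compute every registered feature so that all of them can be stored."""
    return {name: fn(raw) for name, fn in FEATURE_DEFINITIONS.items()}


def production_view(features: Dict[str, float]) -> Dict[str, float]:
    """Return only the features that underwriting uses today."""
    return {n: v for n, v in features.items() if n in PRODUCTION_FEATURES}


raw = {
    "bureau": {"score": 712, "inquiries_6m": 2},
    "accounts": [{"balance": 1500.0}, {"balance": 9200.0}],
}

stored = compute_all_features(raw)   # full set, kept for analysis
served = production_view(stored)     # narrow set, passed to underwriting
print(stored)
print(served)
```

Adding a feature then means adding one entry to the registry, without touching the underwriting code, and the extra columns accumulate history that later models can be trained on.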

Pros and Challenges

The implementation of integrated data transformation tools provides several distinct benefits. First, it eliminates the barrier between the data and those who use it, simplifying the process of generating new features. It also allows for more substantial data storage, leading to a richer data environment and, consequently, more robust models.

Moreover, this approach enables banks to take full advantage of their data by letting ML and risk teams find and exploit new patterns. In turn, this enhances predictive analytics capabilities and improves overall business intelligence.

However, there are challenges associated with the integration of data transformation tools. For one, the transformation process may initially be time-consuming and tedious, requiring banks to invest in training and tool development. Furthermore, there may be resistance from different teams within the organization to change established processes and adapt to this new system. The separation of these tools from existing underwriting tools may also add to the complexity, potentially leading to initial hiccups in the integration process.

Despite these challenges, the benefits of data transformation tools are considerable. As the banking industry continues to recognize the value of data-driven strategies, these tools provide an effective solution to the prevailing data illusion problem, leading to improved decision-making, risk management, and ultimately, superior customer service and loan portfolio analysis.

The next two articles are about Feature Store Technology and Data Censoring in Credit Analytics.

Stay tuned.

This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.

Comments: (1)

Ketharaman Swaminathan Founder and CEO at GTM360 Marketing Solutions

09 October 2023

I've lost count of the number of times I've been hearing this narrative in the last 10+ years.

If this is really the state of data at banks, any idea what the heck have Analytics / BI / DW giants like Business Objects, Cognos, SAS, Cardlytics, Datadog, Snowflake, and dozens of others been selling to banks all these years? Snake oil??
