Our BizDev team at work asked me to explain why Data Management developed a different culture and language from those of Data Science. I found this hard, in part because some of my background comes from engineering maths, which, for many years, was tightly coupled with the arguably adjacent Financial Services discipline of Quant, or Financial Engineering, terms which my work colleagues rarely engage with.
This matters because Quant and Financial Engineering, like Data Management, preceded modern data science by decades. Indeed, Data Science became a thing only in the mid-2010s. I've argued before that Quants are the original data scientists. Given the arrival of DeepSeek, essentially a firm populated by and supported by Quants, Quants have pedigree as modern AI engineers too.
So this Opinion ties together three distinct but interrelated worlds:
I will also make a case, one which doesn't involve Moore's Law or cheap data storage, important though both were, why:
To help our BD team, I sketched this pre-data-science timeline blending the three disciplines, and added pictures of Ne-Yo, Pink and Taylor Swift, which I will explain later.
For folks like me without tabular backgrounds, here's a brief summary of the history of tables, and how they align to Quant, Data Management and analytics teams.
📜 A Brief History of Tables in Computing:
✅ Pre-20th Century: Mathematicians and statisticians used tables to categorize and organize data, as early scientific records, accounting ledgers, and statistical tables.
✅ 1950s–1960s: Early computing structured data in punched-card systems (IBM) and hierarchical databases like IMS (1966), but they lacked the flexibility of relational tables.
✅ 1970: Edgar F. Codd brought the relational model to data storage, introducing tables (“relations”) to modern databases and data management.
✅ 1974–1979: Relational databases (IBM System R, Oracle) used structured tables for enterprise computing.
✅ 1976–1993: Programming languages embraced tabular data: The SAS Programming Language introduced structured data step tables. The R Programming Language (1993) used data frames, essentially tables.
S-PLUS, a commercially supported implementation of the S language from which R also derives, was popular in Quantitative Finance in the late 1990s, while SAS prevailed in enterprise risk analytics, credit risk and risk-based decisioning. All were popular in university statistics departments, and in decision sciences teams in biotech, pharmaceutical and chemical organizations.
Meanwhile, my matrix-based language, MATLAB, prevailed in Financial Engineering and Quantitative Research, particularly for option and derivative pricing, for prototyping, and in production too, on then-emerging proprietary trading desks in capital markets.
Why? Well, these teams employed matrix algebra-literate engineers and applied physicists, while risk and analytics functions tended to hire table-familiar statisticians and mathematicians. Some departments featured both, e.g. buy-side portfolio research teams, or econometricians. This meant good-natured battles between statisticians highlighting table convenience and engineers highlighting matrix computing power. I use the word power because matrices performed well for compute-intense operations, e.g. Principal Components Analysis, regressions, simulation, neural networks/AI, optimization, time-series operations, and much more.
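To make the matrix side of that debate concrete, here is a minimal sketch of Principal Components Analysis done purely with matrix algebra in NumPy. The returns data, the single common factor and the loadings are all made up for illustration; nothing here comes from a real desk or dataset.

```python
import numpy as np

# Hypothetical data: 250 daily returns for 4 assets driven by one common factor.
rng = np.random.default_rng(seed=0)
factor = rng.normal(size=(250, 1))
loadings = np.array([[1.0, 0.8, 0.6, 0.4]])  # illustrative values only
returns = factor @ loadings + 0.1 * rng.normal(size=(250, 4))

# Centre each column, then take the SVD of the centred matrix.
centred = returns - returns.mean(axis=0)
_, singular_values, components = np.linalg.svd(centred, full_matrices=False)

# Share of total variance explained by each principal component.
explained_variance = singular_values**2 / (len(returns) - 1)
explained_ratio = explained_variance / explained_variance.sum()

print(explained_ratio.round(3))
```

The whole analysis is three matrix operations (subtract, decompose, normalize), which is roughly why engineers of that era found a matrix language such a natural fit.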
Therefore, matrix algebra quant applications included:
What Happened in and after 2008?
Ne-Yo’s up-tempo, melodic Closer and its follow-up Miss Independent dominated the pop charts, alongside Pink, at her musical peak, and Taylor Swift, still singing country. The credit crunch hurt. Its regulatory impact will make a brief appearance at the end of this opinion.
However, Wes McKinney, a hedge fund data engineer-cum-quant at AQR Capital Management, introduced the open source, table-based pandas (Python Data Analysis) library to the Python programming language.
Python long preceded McKinney's pandas. A general-purpose language, it originated in the early 1990s, becoming popular for unit-testing scripts. Only when Travis Oliphant, who appreciated Python’s simple, understandable syntax, delivered SciPy in 2001 and NumPy in 2005 did it enter mathematics and engineering, drawing on matrix algebra libraries much as MATLAB had before.
In 2008, however, Wes McKinney brought pandas to Python, and thus tabular convenience to the matrix libraries of Travis Oliphant's NumPy and SciPy.
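As a rough illustration of tables sitting on top of matrices, the sketch below wraps a NumPy array in a pandas DataFrame, does a table-style step, then drops back to NumPy for a matrix-style step. The tickers `AAA` and `BBB`, the dates and the prices are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for two made-up tickers; values are illustrative only.
prices = np.array([[100.0, 50.0],
                   [102.0, 49.0],
                   [101.0, 51.0]])

# Tabular view: the same numbers, now with row and column labels.
table = pd.DataFrame(
    prices,
    index=pd.to_datetime(["2008-01-02", "2008-01-03", "2008-01-04"]),
    columns=["AAA", "BBB"],
)

# Table-style operation: label-aware percentage change per column.
returns = table.pct_change().dropna()

# Matrix-style operation: back to a plain NumPy array for linear algebra.
covariance = np.cov(returns.to_numpy(), rowvar=False)

print(returns)
print(covariance)
```

The statistician gets named columns and dates; the engineer gets the underlying array whenever the linear algebra starts.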
Now data science could take full effect, with tables and matrices in one unified open source programming language, Python, serving statisticians, data engineers, quants and financial engineers. New tools drove community growth further, e.g., reproducible Jupyter notebooks, scikit-learn for machine learning, and PyTorch, Keras, TensorFlow, and other deep learning libraries driving the new transformer technologies that underpin modern AI and LLMs.
Data Science Unifies in the 2010s with Pandas
Fast forward to 2025.
With vector databases, graph structures, and AI-driven data processing, will tables remain so influential?
Well, matrices and vectors will continue to power the engine of AI. However, as someone working with graph technologies, I see the contextual benefits of relationships in graphs, which are built on matrix algebra (as sparse matrices) and extend the convenience of tables. Quoting Tony Seale, the so-called Knowledge Graph Guy, "a customer isn’t just a database row; they’re linked to past purchases, support tickets, email exchanges, written notes, social sentiment, and pricing preferences. An insurance claim isn’t just an entry - it’s tied to policy details, vehicle history, repair records, and similar cases. This isn’t about storage - it’s about making sense of complexity at a scale that rigid databases and APIs simply can’t match." I agree.
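The idea that a graph is both a table of relationships and a sparse matrix can be sketched in a few lines. The node names below loosely echo the customer example in the quote and are purely hypothetical; the tiny matrix is densified for readability, whereas a real system would presumably keep it in a sparse format.

```python
import numpy as np

# Hypothetical entities, loosely echoing the customer example in the quote.
nodes = ["customer", "purchase", "support_ticket", "email"]
index = {name: i for i, name in enumerate(nodes)}

# Edge list: in effect a two-column table of relationships.
edges = [
    ("customer", "purchase"),
    ("customer", "support_ticket"),
    ("support_ticket", "email"),
]

# Coordinate form: only the positions of the non-zero entries are stored.
rows = np.array([index[src] for src, _ in edges])
cols = np.array([index[dst] for _, dst in edges])

# Densify for this tiny example and treat relationships as undirected.
adjacency = np.zeros((len(nodes), len(nodes)), dtype=int)
adjacency[rows, cols] = 1
adjacency = adjacency + adjacency.T

# Matrix algebra answers a graph question: A @ A counts two-step paths.
two_step = adjacency @ adjacency
paths_customer_to_email = two_step[index["customer"], index["email"]]

print(paths_customer_to_email)
```

Here the customer reaches the email only through the support ticket, and a single matrix product surfaces that link without any explicit traversal code.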
Yet revitalized by Parquet, Arrow and Iceberg formats underpinning the so-called lakehouse and new streaming analytics ecosystems, tables are here to stay too.
We in FinTech have much to celebrate in driving and governing AI.
This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.