Market data giant Bloomberg is set to capitalise on the craze for all things AI by building a 50-billion parameter large language model for finance, dubbed BloombergGPT.
Bloomberg has released a research paper detailing the development of BloombergGPT, which has been specifically trained on a wide range of financial data to support a diverse set of natural language processing (NLP) tasks within the financial industry.
The firm says the model will assist in improving existing financial NLP tasks, such as sentiment analysis, named entity recognition, news classification, and question answering, while unlocking opportunities for Terminal clients to marshal the vast quantities of data flowing through the market.
To build BloombergGPT, the firm's engineers pulled from an extensive archive of 40 years of financial data to create a comprehensive 363 billion token, or word fragment, dataset consisting of English financial documents.
This in-house data was augmented with a 345 billion token public dataset scraped from the likes of YouTube and Wikipedia to create a large training corpus with over 700 billion tokens.
For context, OpenAI's 2020 release of GPT-3 was trained on a dataset of roughly 500 billion tokens.
Using just a portion of this training corpus, the Bloomberg team trained a 50-billion parameter decoder-only causal language model.
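As a quick check on the corpus figures quoted above, the short sketch below adds the two reported token counts and works out each source's share of the total. The function name `corpus_summary` and the constant names are illustrative choices, not anything taken from Bloomberg's paper; the only inputs are the 363 billion and 345 billion figures reported in this article.

```python
# Token counts as reported in the article, in billions.
# The names below are illustrative assumptions, not Bloomberg's own.
FINANCIAL_TOKENS_B = 363  # in-house English financial documents
PUBLIC_TOKENS_B = 345     # public datasets


def corpus_summary(financial_b, public_b):
    """Return the combined corpus size and each source's share of it."""
    total = financial_b + public_b
    return {
        "total_b": total,
        "financial_share": financial_b / total,
        "public_share": public_b / total,
    }


summary = corpus_summary(FINANCIAL_TOKENS_B, PUBLIC_TOKENS_B)
print(f"Total: {summary['total_b']} billion tokens")          # Total: 708 billion tokens
print(f"Financial share: {summary['financial_share']:.1%}")  # Financial share: 51.3%
print(f"Public share: {summary['public_share']:.1%}")        # Public share: 48.7%
```

On these numbers the combined corpus comes to 708 billion tokens, consistent with the "over 700 billion" figure, split almost evenly between financial and general-purpose text.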
“The quality of machine learning and NLP models comes down to the data you put into them,” explains Gideon Mann, head of Bloomberg’s ML product and research team. “Thanks to the collection of financial documents Bloomberg has curated over four decades, we were able to carefully create a large and clean, domain-specific dataset to train a LLM that is best suited for financial use cases.”
He says the BloombergGPT model outperforms existing open models of a similar size on financial tasks by large margins, while still performing on par or better on general NLP benchmarks.
“For all the reasons generative LLMs are attractive - few-shot learning, text generation, conversational systems, etc. - we see tremendous value in having developed the first LLM focused on the financial domain,” says Shawn Edwards, Bloomberg’s chief technology officer. “BloombergGPT will enable us to tackle many new types of applications, while it delivers much higher performance out-of-the-box than custom models for each application, at a faster time-to-market.”