More and more financial institutions are considering adopting LLMs for their systems. However, the leap from a small MVP/POC to industrialization is non-negligible.
Large models such as Llama2-13B or Llama2-70B require a significant investment to run on your production infrastructure. These models are excellent all-rounders, able to cover multiple domains simultaneously, but their billions of parameters come with high hardware costs.
However, more and more so-called Tiny LLMs are being published, often with a fraction of the parameters, such as TinyLlama-1.1B.
Those Tiny LLMs run on almost any infrastructure and enable innovation in various use cases where the adoption of huge LLMs was impossible beforehand.
Tiny LLMs are typically specialized in specific domains, which limits their versatility and generalization capabilities. Let me explain this in more detail.
This story is part of a three-week sequence:
Introduction of TinyLLM in Finance: The first story of the sequence introduces TinyLLMs and their importance in finance.
TinyLLM in Action: The second story of the sequence will introduce a use case of implementing a TinyLLM on your local laptop or PC.
Which LLM to choose for GenAI in Finance: The last story of the sequence will discuss several LLMs (Proprietary and Open-Source) and the decision-making process.
I. What are Tiny LLMs?
Tiny Language Models (Tiny LLMs) prioritize efficiency and specialized functionality over scale, representing a paradigm shift in AI development rather than just miniaturized versions of existing models. A positive side effect is a lower carbon and water footprint, aligning with the industry's shift towards sustainable AI solutions.
Besides that, their efficiency in training and inference makes them more accessible and practical for use in a broader range of devices, including those with limited resources like mobile phones.
One example of such a Tiny LLM is TinyLlama-1.1B.
The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. The training has started on 2023-09-01.
We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged and played in many open-source projects built upon Llama. Besides, TinyLlama is compact with only 1.1B parameters. This compactness allows it to cater to a multitude of applications demanding a restricted computation and memory footprint - Source: https://github.com/jzhang38/TinyLlama
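To put the "restricted computation and memory footprint" into numbers, here is a rough back-of-the-envelope sketch. The bytes-per-parameter figures are the standard sizes for fp16 and 4-bit weights; actual runtime usage adds overhead for activations and the KV cache, so treat these as lower bounds:

```python
# Rough weight-memory estimate for LLMs at different precisions.
# Activations, KV cache, and framework overhead come on top of this.

GIB = 1024 ** 3

def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights."""
    return n_params * bytes_per_param / GIB

models = {"TinyLlama-1.1B": 1.1e9, "Llama2-13B": 13e9, "Llama2-70B": 70e9}

for name, n in models.items():
    fp16 = weight_memory_gib(n, 2.0)  # 16-bit floats: 2 bytes/param
    q4 = weight_memory_gib(n, 0.5)    # 4-bit quantized: ~0.5 bytes/param
    print(f"{name}: ~{fp16:.1f} GiB (fp16), ~{q4:.1f} GiB (4-bit)")
```

A 4-bit TinyLlama fits comfortably in the RAM of an ordinary laptop, while Llama2-70B in fp16 needs well over a hundred gigabytes of accelerator memory.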
Other Tiny/Small LLMs are:
Vicuna: Vicuna is an open-source chatbot trained on user conversations from ShareGPT. It can respond to various prompts and is suitable for customer service, education, and entertainment purposes. Learn more about Vicuna
Koala: Koala is a dialogue model that is both affordable and highly effective. Trained on a combination of public datasets and dialogues from large language models, it offers strong capabilities for text generation, language translation, creative writing, and informative responses, and it can follow instructions to complete tasks. Explore more about Koala
Alpaca: Alpaca is an AI model currently being developed, which is trained on data generated by OpenAI's text-davinci-003. It has the ability to follow instructions and carry out tasks such as writing emails, creating presentations, and managing social media accounts. Once completed, Alpaca is expected to be a valuable tool for researchers and developers. Discover more about Alpaca
II. Impact of Tiny LLMs in Finance
All those previously mentioned reasons and explanations make Tiny LLMs important players in Finance. These models, distinguished by their relatively small size, are optimized for a limited number of tasks, are efficient, and are powerful, making them well-suited for various financial applications.
At their core, Tiny LLMs are streamlined versions of their larger counterparts. They possess fewer parameters, which translates to lower computational demand. However, this reduction in size and complexity does not necessarily diminish their effectiveness for targeted tasks.
With the launch of Gemini, announced on 6th December 2023, Google has marked a new shift: Nano LLMs. An LLM that can run on a smartphone, yes, a smartphone! We clearly see a shift from generalized huge LLMs such as GPT-3.5 or GPT-4 towards tiny or nano LLMs that are more specialized but less generalized - Source: https://blog.google/products/pixel/pixel-feature-drop-december-2023/
Besides Google's new release, Tiny LLMs offer several remarkable capabilities:
1. Efficiency
Financial institutions, particularly smaller ones with limited IT budgets, find these models appealing because they require less computational resources. This aspect is crucial in an industry where cost savings can directly impact competitiveness and profitability. Adopting Tiny LLMs allows these institutions to leverage GenAI capabilities without the big investment typically associated with larger models.
Quick Training
Cost-effective deployment
2. Accessibility
Accessibility and hardware compatibility are additional benefits. Unlike their larger counterparts that often need specialized hardware like powerful GPUs, Tiny LLMs can run on more modest, easily accessible hardware. This feature democratizes access to cutting-edge AI technologies, allowing a broader range of financial players to implement sophisticated data analysis and customer service solutions.
Flexible Hardware Requirements
Lower Data Requirements
3. Functionality
Moreover, Tiny LLMs are making a mark in customer service by powering automated chatbots and virtual assistants. These AI-driven tools can handle many customer queries, from simple account inquiries to more complex financial advice. The models’ ability to understand and process natural language enables them to provide accurate and contextually relevant responses, enhancing the customer experience.
Question answering
Translation
Text Generation
Coding
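As a sketch of how such a chatbot could be wired up, here is a minimal prompt builder for the [INST] chat format used by Llama-2-style models (which Llama-derived models such as TinyLlama often reuse, though some small chat models use different templates). The system prompt and example question are illustrative assumptions, not from any real deployment:

```python
# Minimal prompt builder in the Llama 2 chat format.
# The system prompt and user question are illustrative only.

def build_prompt(system: str, user: str) -> str:
    """Wrap a system prompt and user message in the Llama 2 [INST] format."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

SYSTEM = (
    "You are a helpful banking assistant. Answer account questions "
    "concisely and never give personalized investment advice."
)

prompt = build_prompt(SYSTEM, "How do I reset my online banking password?")
print(prompt)
```

The resulting string would then be fed to a locally hosted small model; the template, not the model size, is usually what determines whether the assistant stays in its assigned role.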
Tiny LLMs represent a significant advancement in financial technology. By offering cost-effective, accessible, and versatile AI solutions, they enable financial institutions to harness the power of advanced data analytics and AI, regardless of their size or IT capabilities.
However, remember that these models are less capable than their larger counterparts with many billions of parameters. It is always a compromise.
Why should we care, and what use cases exist?
The idea of using a 1.1B-parameter model is appealing for several reasons.
Generally, a model with fewer than 3B parameters can run without CUDA support as long as you have enough RAM.
Intelligent engineers and AI enthusiasts have quantized these models, so basically, anyone can run them.
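Quantization here means storing weights at lower numeric precision. A toy sketch of symmetric int8 quantization is below; real quantizers (for example in llama.cpp or bitsandbytes) work block-wise and support 4-bit formats, so this only illustrates the principle:

```python
# Toy symmetric int8 quantization of a list of float weights.
# Real quantizers work block-wise and support 4-bit formats;
# this only illustrates the size/accuracy trade-off.

def quantize_int8(weights):
    """Map floats to int8 range [-127, 127] with a single scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.63, 0.005, -0.4]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, f"max reconstruction error: {max_err:.4f}")
# Each weight now needs 1 byte instead of 4 (float32),
# at the cost of a small reconstruction error.
```

Going from 32-bit to 8-bit weights cuts memory by 4x; 4-bit schemes cut it by 8x, which is what puts a 1.1B model within reach of an ordinary laptop.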
If you are a small financial institution and do not have access to high-end clusters with GPUs, then these Tiny LLMs are probably something for you.
Use Cases of Small LLMs
As small LLMs evolve, they're paving the way for more innovative and unique applications. Here are a few compelling use cases:
Business Intelligence: Small LLMs can provide data-driven insights to help businesses make informed decisions, improving efficiency, profitability, and customer satisfaction.
Customer Service: Developing multilingual chatbots with small LLMs can enhance customer service by providing round-the-clock support and decreasing the cost of human customer service representatives.
Education: LLMs can simplify concept learning, enhancing accessibility, affordability, and personalization. For example, develop an internal education platform to get information and support for specific finance-related problems and tasks.
Research: LLMs can speed up research and uncover new insights from big data.
Small LLMs are helping organizations do more with less by automating tasks, personalizing customer experiences, enabling informed decision-making, and promoting innovation.
Closing Thoughts and Next Steps
This article explores the impact of Tiny LLMs in Finance, emphasizing the opportunities they create for individual developers, startups, and mid-sized businesses. It also analyzes this trend's transformative influence on the GenAI and technology sphere.
If you want to delve deeper into the topic of AI democratization and Large Language Models, see the recommended readings below:
Decentralizing the Power of Data: A Financial Paradigm Shift in 2023
In the digital age, data has often been likened to oil, a valuable resource that powers much of our modern world. However, unlike oil, data is ever-expanding, with the World Economic Forum estimating that data creation would surge to a staggering 94 Zettabytes in 2022, a significant leap from 74 Zettabytes in 2021. But raw data, much…
Generative AI in Finance
In the dynamic world of finance, where numbers dance and markets pulse with life, a new trio of revolutionary forces is emerging on the horizon – Generative AI, Explainable AI, and Responsible AI. These aren’t just buzzwords; they are the facts of a new financial era, sculpting a landscape where innovation meets transparency, and…