TinyLLM in Action in Finance
Story 2/3: How to implement an LLM or TinyLLM within your infrastructure, even on a simple PC.
This story is part of a three-week sequence. This time: TinyLLM in Action, Part 2/3:
Introduction of TinyLLM in Finance: The first story of the sequence introduces TinyLLMs and their importance in finance.
TinyLLM in Action: The second story of the sequence will introduce a use case of implementing a TinyLLM on your local laptop or PC.
Which LLM to choose for GenAI in Finance: The last story of the sequence will discuss several LLMs (Proprietary and Open-Source) and the decision-making process.
I. Explanation of LLM, GenAI and TinyLLM
LLM (such as GPT-4 used for ChatGPT) stands for Large Language Model, which is a type of Generative AI specifically designed to understand and generate human language. Think of it as a knowledgeable friend who has read an enormous library and can chat about almost any topic, write essays, or even create poems. LLMs learn from the text on the internet, books, and other written materials to understand how language works and respond appropriately to written prompts, making them highly versatile in handling various language-based tasks.
But what is Generative AI? Generative AI refers to artificial intelligence that can create content. It's like a virtual artist with a limitless imagination, able to craft images, write stories, or compose music. Unlike traditional AI, which analyzes and sorts information, generative AI produces new and original material, learning from vast amounts of data how to mimic various styles and formats. It's like teaching a computer to paint like an artist by showing it thousands of paintings and then asking it to create its own artwork.
Tiny Language Models (TinyLLMs) prioritize efficiency and specialized functionality over scale, representing a paradigm shift in AI development rather than just miniaturized versions of existing models. A positive side effect: a lower carbon and water footprint, aligning with the industry's shift towards sustainable AI solutions.
Examples are Vicuna, Koala, Alpaca, and TinyLlama.
Introduction of TinyLLM in Finance
More and more financial institutions are considering adopting LLMs for their systems. However, the step from a small MVP / POC to industrialization is non-negligible. Large LLM models such as LLama2-13B or LLama2-70B require a significant investment to run on your production infrastructure. These models are excellent all-rounders but come with high deployment costs.
II. Motivation to deploy LLMs or TinyLLMs in Finance
The overall motivation for the deployment of LLMs or TinyLLMs in finance can be reduced to three main parts:
Efficiency
Accessibility
Functionality
I've already discussed this more generally in the first story of the sequence, "Introduction of TinyLLM in Finance". This time, I will give you some examples and use cases in action.
All use cases share one constraint: for security reasons, no private information may leave the financial institution, and the tools must also work offline.
Use case #1: A Developer buddy in the form of an LLM / TinyLLM
If you are a developer, you will understand the benefits of such a tool. Imagine the following:
You are developing a new Python algorithm, Java backend code, or something else and need to write the code documentation, the comments within the code, and so on. You could upload your code to the LLM tool to write your documentation. If you are already one step ahead, you could use an LLM agent that checks one code file at a time and finishes the work "almost" without human interaction.
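As a minimal sketch of that agent idea (the document_code helper and the file-walking loop below are my own illustration, assuming a LangChain LLMChain like the one built in Section IV):

from pathlib import Path

def document_code(llm_chain, source: str) -> str:
    # Ask the LLM to write docstrings and comments for one source file
    prompt = f"Add docstrings and inline comments to this code:\n{source}"
    return llm_chain.predict(user_input=prompt)

def document_project(llm_chain, project_dir: str) -> None:
    # Walk the project one file at a time, as described above
    for path in Path(project_dir).rglob("*.py"):
        documented = document_code(llm_chain, path.read_text())
        path.with_suffix(".documented.py").write_text(documented)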
Besides that, there are also more straightforward use cases:
Code translation, such as Java to Python, Python to vb.net, etc.
Code generation, such as “Write me a quick sort algorithm in Python” → Will be demonstrated below in the TinyLLM in Action use case.
Interesting source:
LLM Agent: https://www.ionio.ai/blog/what-is-llm-agent-ultimate-guide-to-llm-agent-with-technical-breakdown
Use case #2: Offline translator fine-tuned on corporate mail
I live in a multilingual country: I grew up speaking Luxembourgish as my mother tongue, learned German and French at primary school, and then English at high school. If that sounds familiar, you will understand why you need a translator. You may know all the languages, but sometimes your brain is so muddled that you need extra help, or one of those languages is simply much stronger than the others.
DeepL is a wonderful online tool, but it is online and carries the risk of data loss. So what is the offline solution? An LLM or TinyLLM deployed on-site in your infrastructure to translate, for example, English into French. With a TinyLLM, you focus on a single language pair to stay efficient; with an LLM like Llama2-13B or even Llama2-70B, you get a great model that understands multiple languages at once.
III. Preparation and Setup
To have a secure programming environment with Python, you have to follow a few initial preparation steps:
1. Setting Up a Secure Workspace
Creating a Dedicated Project Folder:
The first step in starting a secure Python project is to create a dedicated folder for your project. This is essential for maintaining a structured and organized workspace. It's not just about having a place for your code; it's about creating an environment where everything related to your project lives.
Integrating Version Control:
Once your project folder is ready, the next step is to integrate it with a version control system, such as Git. This isn't just about tracking changes; it's about ensuring the integrity and security of your codebase. Version control allows you to collaborate with others securely, track the history of your project, and revert to previous states if needed. A helpful source: https://git-scm.com/docs/gittutorial
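Initializing the repository takes only a few commands from within the project folder:

# Initialize Git and create the first commit
git init
git add .
git commit -m "Initial commit"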
2. Using a Virtual Environment
Creating an Isolated Environment with venv:
Python's built-in venv module allows you to create an isolated environment for your project. This is crucial for managing dependencies without affecting other projects or the global Python environment. To create a virtual environment, navigate to your project folder and run:
# Windows
python -m venv venv
venv\Scripts\activate

# Unix / Mac
python3.10 -m venv venv
source venv/bin/activate
Installing Project Dependencies:
With the virtual environment activated, use pip to install the packages your project requires. This ensures that all dependencies are contained within your project, preventing conflicts and maintaining a clean global environment.
For this project, we need the following packages:
# Packages
langchain==0.0.350
ctransformers==0.2.27
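With the virtual environment active, a single pip command installs both pinned versions:

pip install langchain==0.0.350 ctransformers==0.2.27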
3. Dependency Management
Staying Secure with Updated Dependencies:
One of the most overlooked aspects of secure programming is dependency management. Regularly update your dependencies to patch any known vulnerabilities. Tools like pip-audit can be used to scan your dependencies for known vulnerabilities.
Creating a requirements.txt File:
Maintain a requirements.txt file to keep track of your project's dependencies. Use the pip freeze command to generate this file. This helps set up the project on different machines and plays a crucial role in identifying and managing dependencies securely.
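Both steps are one-liners inside the activated environment (pip-audit is installed as a regular package first):

# Snapshot the environment into requirements.txt
pip freeze > requirements.txt

# Scan the pinned dependencies for known vulnerabilities
pip install pip-audit
pip-audit -r requirements.txt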
IV. Implementation
In this demonstration of LLM and TinyLLM, I will show how you can use them for coding and translation. The implementation will follow:
Development of the LLM and TinyLLM code within an Anaconda3 Jupyter notebook for translation and coding.
Testing of the code
For coding, we will use the LLM model Mistral-7B, and for translation, TinyLlama-1.1B.
Implementation of Mistral-7B with Python and CPU usage only
from langchain.chains import LLMChain
from langchain.llms import CTransformers
from langchain.prompts import PromptTemplate

# Generation settings for ctransformers (the values here are illustrative)
config = {"max_new_tokens": 512, "temperature": 0.1, "context_length": 2048}

# Mistral-Instruct prompt format: the user prompt is wrapped in [INST] tags
TEMPLATE = """<s>[INST] {user_input} [/INST]"""

llm = CTransformers(model="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
                    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
                    config=config)

prompt = PromptTemplate(template=TEMPLATE, input_variables=["user_input"])
llm_mistral = LLMChain(prompt=prompt, llm=llm)

# Top-level await works directly inside a Jupyter notebook
result = await llm_mistral.apredict(user_input="Explain AI for kids.")
print(result)
Result of this:
Hello there! I'd be happy to explain what AI is in simple terms for kids.
AI stands for Artificial Intelligence, which means that it's like making a computer or machine smarter than humans in some ways. Imagine if you could teach a robot to do something that only humans can do, like playing chess or recognizing faces. That's AI!
It's like making the computer more "smart" and able to understand things on its own, without needing someone to tell it what to do every step of the way. It's like having a super-smart friend who can help you with anything!
AI is used in many different ways, from playing games to helping doctors and scientists make better decisions. And as technology continues to advance, we'll see even more amazing things that AI can do. I hope this explanation has helped you understand what AI is for kids!
Let's test our code on the real use case: coding. I've prepared two tests: one for code generation and one for bug fixing.
Generate Code
code_generation = """Generate me a python class with two sorting algorithm. Quicksort and bubble sort."""
result = await llm_mistral.apredict(user_input=code_generation)
print(result)
```python
class SortingAlgorithms:
    def __init__(self):
        self.quicksort = quicksort
        self.bubblesort = bubblesort

    def quicksort(self, arr):
        """Implementation of quicksort algorithm"""
        if len(arr) <= 1:
            return arr
        else:
            pivot = arr[0]
            left = [x for x in arr[1:] if x < pivot]
            middle = [x for x in arr[1:] if x == pivot]
            right = [x for x in arr[1:] if x > pivot]
            return quicksort(left) + middle + quicksort(right)

    def bubblesort(self, arr):
        """Implementation of bubblesort algorithm"""
        n = len(arr)
        for i in range(n):
            for j in range(0, n-i-1):
                if arr[j] > arr[j+1]:
                    arr[j], arr[j+1] = arr[j+1], arr[j]
        return arr
```
Bug Fixing
For the bug-fixing test, I took the previously generated quicksort function and changed the line "middle = [x for x in arr if x == pivot]" to "middle = [x for x in arr if x <= pivot]".
wrong_code = """Fix the following code:
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x <= pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)"""
result = await llm_mistral.apredict(user_input=wrong_code)
print(result)
```python
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)
```
In the code, there was a mistake in the middle list. It should be `[x for x in arr if x == pivot]`, instead of `[x for x in arr if x <= pivot]`. This will ensure that the pivot element is included in the middle list and not in either the left or right lists.
As you can see, the model spotted the error and proposed both a fix and an explanation. Impressive, right?
Implementation of TinyLlama-1.1B with Python and CPU usage only
Before I use the TinyLlama model for translation, I would like to explain why TinyLLMs are essential and how they differ from larger models. Although TinyLLMs have less knowledge about the world, they can usually understand the context and perform tasks like translation and summarization effectively.
from langchain.chains import LLMChain
from langchain.llms import CTransformers
from langchain.prompts import PromptTemplate

# ChatML prompt for the OpenOrca fine-tune; the translation instruction in
# the system message is an illustrative choice, not from the original setup
TEMPLATE = """<|im_start|>system
Translate the following text from English to French.<|im_end|>
<|im_start|>user
{user_input}<|im_end|>
<|im_start|>assistant
"""
llm = CTransformers(model="TheBloke/TinyLlama-1.1B-1T-OpenOrca-GGUF")
prompt = PromptTemplate(template=TEMPLATE, input_variables=["user_input"])
llm_llama = LLMChain(prompt=prompt, llm=llm)
text = """
Artificial intelligence (AI) is the creation of a robot or computer program capable of doing things for itself, without a human telling it exactly what to do.
This means giving the computer the ability to think and learn like a person, so that it can solve problems and make decisions for itself.
Just as you learn new things and grow, AI systems also learn from their experiences and improve over time.
Is there anything else you'd like to know about AI?
"""
result = await llm_llama.apredict(user_input=text)
print(result)
Result:
L'intelligence artificielle (IA) est la création d'un robot ou d'un programme informatique capable de faire des choses par lui-même, sans qu'un humain lui dise exactement ce qu'il doit faire.
Il s'agit de donner à l'ordinateur la capacité de penser et d'apprendre comme une personne, afin qu'il puisse résoudre des problèmes et prendre des décisions par lui-même.
Tout comme vous apprenez de nouvelles choses et évoluez, les systèmes d'IA tirent également des enseignements de leurs expériences et s'améliorent avec le temps.
Y a-t-il autre chose que vous aimeriez savoir sur l'IA ?
Importance & Conclusion
The trend towards using specialized small Large Language Models (LLMs) for specific tasks instead of large, generalized models marks a pivotal shift in artificial intelligence and machine learning, focusing on efficiency and precision.
Generalized models such as LLama2-70b or GPT-4 are excellent all-rounders but come with an extremely high cost of deployment.
This approach can be dissected into three key areas:
Resource management and efficiency
Hosting behind APIs
Overall importance
Resource Management and Efficiency
Large LLMs like GPT-4 demand considerable computational resources, including advanced GPUs and extensive RAM, both for training and deployment. Specialized small LLMs, in contrast, are designed to operate efficiently with fewer resources, making them more accessible and cost-effective for various organizations, particularly smaller ones. The reduced computational power required for these models also translates to lower power consumption, contributing to a more sustainable AI ecosystem.
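A back-of-the-envelope estimate makes the gap concrete (the bytes-per-parameter figures are the usual ones for fp16 and 4-bit quantization; real footprints add runtime overhead):

# Rough RAM estimate: parameters x bytes per parameter
def model_ram_gb(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * 1e9 * bytes_per_param / 1024**3

print(f"Llama2-70B, fp16:    ~{model_ram_gb(70, 2.0):.0f} GB")    # ~130 GB
print(f"Mistral-7B, Q4_K_M:  ~{model_ram_gb(7, 0.57):.0f} GB")    # ~4 GB
print(f"TinyLlama-1.1B, Q4:  ~{model_ram_gb(1.1, 0.57):.1f} GB")  # ~0.6 GB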
Hosting Behind APIs
The deployment of specialized LLMs via APIs offers a flexible and scalable solution. This method allows businesses to deploy multiple instances of these models as needed, effectively managing demand without the overhead of large-scale infrastructure. APIs facilitate the easy integration of AI capabilities into existing systems, ensuring that even non-experts can leverage advanced AI tools.
Moreover, the customizability offered by API-hosted LLMs means that businesses can have models tailored for specific tasks or industries, significantly enhancing the effectiveness and efficiency of these tools.
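To make this concrete, here is a minimal sketch of hosting such a model behind a REST API (FastAPI, the /generate endpoint, and the request schema are my illustrative choices, not part of the original setup; llm_mistral is the chain built in Section IV):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PromptRequest(BaseModel):
    user_input: str

@app.post("/generate")
async def generate(request: PromptRequest):
    # llm_mistral: the LLMChain built in Section IV
    # Run the chain asynchronously, exactly as in the notebook examples
    result = await llm_mistral.apredict(user_input=request.user_input)
    return {"result": result}

# Start the server with: uvicorn app:app --host 0.0.0.0 --port 8000

Each instance of such a service can sit behind a load balancer, which is what makes scaling small models out, rather than scaling one huge model up, practical.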
Importance of Specialized Small LLMs
Specialized LLMs are crucial for several reasons.
Firstly, they are optimized for specific tasks, yielding higher accuracy and efficiency in their designated areas. For instance, a model fine-tuned for medical diagnosis would significantly outperform a general-purpose model in that field.
Secondly, the reduced resource needs of small LLMs democratize AI, making advanced technologies accessible to smaller organizations and fostering a more inclusive AI landscape.
Lastly, the ease and cost-effectiveness of deploying and managing small LLMs encourage innovation and experimentation, essential for the growth and diversification of AI applications.
In summary, specialized small LLMs offer a more sustainable, efficient, and accessible approach to AI deployment. Their ability to be precisely tuned for specific tasks, coupled with their lower resource demands and ease of integration via APIs, positions them as a vital component in the evolving landscape of artificial intelligence.