AI Jargon

Models

A model is a mathematical representation trained to recognise patterns in data. It transforms inputs into desired outputs. Below is a linear regression example using scikit-learn.

from sklearn.linear_model import LinearRegression
import numpy as np

# Create some sample data
X = np.array([[1], [2], [3], [4]])
y = np.array([2, 4, 6, 8])  # Linear relationship: y = 2x

# Create and train a model
model = LinearRegression()
model.fit(X, y)

# Make predictions
prediction = model.predict([[5]])
print(f"Predicted value: {prediction[0]}")  # Should predict ~10

Prompts

A prompt is like a question or a starting point you give to an AI system to get a response. Think of it as the input you provide to the AI to tell it what you want. For example:

  • If you ask, “What’s the capital of France?” the prompt is the question itself.
  • Similarly, a creative prompt could be, “Write a short poem about the ocean.”

LLMs

A Large Language Model (LLM) is an artificial intelligence system that processes and generates text based on extensive training data. The core function is pattern recognition and text prediction at scale.
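
Since next-token prediction is the core operation, a small open model such as GPT-2 can make it visible. Below is a minimal sketch using the Hugging Face transformers library; the model choice and prompt are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# The highest-scoring logit at the last position is the predicted next token
next_token_id = int(logits[0, -1].argmax())
print(tokenizer.decode(next_token_id))  # likely " Paris"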

Tokens

A token is a piece of text that the AI processes. Tokens can be as small as a single letter, punctuation mark, or word fragment. For example:

  • The sentence, “AI is fun!” might break into tokens like “AI”, “ is”, and “ fun!” depending on the AI system.

AI models use tokens to understand and generate text. The number of tokens in your prompt and the AI’s response affects processing time and cost. Generally, the longer the input or output, the more tokens are used.
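
To see tokenisation in practice, here is a minimal sketch using the Hugging Face GPT-2 tokenizer; other models split text differently.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

tokens = tokenizer.tokenize("AI is fun!")
print(tokens)       # e.g. ['AI', 'Ġis', 'Ġfun', '!'] - 'Ġ' marks a leading space
print(len(tokens))  # token count drives processing time and cost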

Agents

An agent is a system that can perceive its environment and take actions to achieve specific goals. While a model makes predictions, an agent uses those predictions to make decisions. Modern AI assistants are examples of agents that combine multiple models and decision-making systems.

import numpy as np

class SimpleAgent:
    def __init__(self, model):
        self.model = model
        self.memory = []  # history of observed states

    def perceive(self, environment_state):
        # Record the latest observation of the environment
        self.memory.append(environment_state)

    def decide_action(self):
        # Use the model's prediction for the current state to pick an action
        current_state = self.memory[-1]
        prediction = self.model.predict([current_state])
        return self.choose_best_action(prediction)

    def choose_best_action(self, prediction):
        # Simple policy: pick the index of the highest predicted score
        # (assumes the model returns one score per possible action)
        return int(np.argmax(prediction))

Core Machine Learning Concepts

  1. Supervised Learning: Training with labeled data. The model learns from examples where the correct answer is provided:
from sklearn.ensemble import RandomForestClassifier

# Training data: [feature1, feature2] -> label
X_train = [[0, 0], [1, 1], [1, 0], [0, 1]]
y_train = [0, 1, 1, 0]  # Binary classification labels

clf = RandomForestClassifier()
clf.fit(X_train, y_train)
  2. Unsupervised Learning: Finding patterns without labels (see the clustering sketch after this list)
  3. Reinforcement Learning: Learning through trial and error with rewards
  4. Transfer Learning: Using knowledge from one task to improve performance on another
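
To make the unsupervised case concrete, here is a minimal clustering sketch using scikit-learn's KMeans; the data points and cluster count are arbitrary.

from sklearn.cluster import KMeans
import numpy as np

# Unlabelled data: the algorithm must discover the grouping itself
X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [8.5, 9.0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)
print(kmeans.labels_)  # e.g. [0 0 1 1]: two discovered clusters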

Transformers: The Architecture Behind Modern AI

Transformers are neural network architectures that revolutionized natural language processing and beyond. Unlike earlier sequential models, transformers process all input elements simultaneously and use a mechanism called “attention” to understand relationships between different parts of the input.

import torch
import torch.nn as nn

class SimpleTransformerBlock(nn.Module):
    def __init__(self, embed_dim, num_heads):
        super().__init__()
        self.attention = nn.MultiheadAttention(embed_dim, num_heads)
        self.feed_forward = nn.Sequential(
            nn.Linear(embed_dim, embed_dim * 4),
            nn.ReLU(),
            nn.Linear(embed_dim * 4, embed_dim)
        )
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)

    def forward(self, x):
        # Self-attention mechanism
        attention_output, _ = self.attention(x, x, x)
        x = self.norm1(x + attention_output)

        # Feed-forward network
        ff_output = self.feed_forward(x)
        x = self.norm2(x + ff_output)
        return x
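
To sanity-check the block, pass a random batch through it; note that nn.MultiheadAttention defaults to a (sequence length, batch size, embedding dimension) input layout.

block = SimpleTransformerBlock(embed_dim=64, num_heads=4)
x = torch.randn(10, 2, 64)  # (seq_len, batch, embed_dim)
print(block(x).shape)       # torch.Size([10, 2, 64])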

In practice you rarely build a transformer from scratch; pre-trained transformer models can be loaded and served directly. For example, a small BERT-based classification service:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

class InferenceService:
    def __init__(self):
        self.tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        self.model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    def predict(self, text):
        inputs = self.tokenizer(text, return_tensors="pt")
        with torch.no_grad():  # inference only, no gradients needed
            outputs = self.model(**inputs)
        return outputs.logits

Frameworks and Ecosystems

PyTorch

PyTorch's eager, define-by-run execution makes it particularly useful for research and rapid prototyping:

import torch
import torch.nn as nn

# Define a simple neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.layers(x)

# Training loop example with dummy data
model = SimpleNet()
optimizer = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(32, 784)           # a batch of 32 flattened 28x28 images
targets = torch.randint(0, 10, (32,))   # random class labels for illustration
epochs = 5

for epoch in range(epochs):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, targets)
    loss.backward()
    optimizer.step()

TensorFlow

TensorFlow excels in production deployment and mobile applications. It offers robust tools for model serving and optimization:

import tensorflow as tf
import numpy as np

# Define a model using Keras (TensorFlow's high-level API)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile and train
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Dummy data so the example runs end-to-end
x_train = np.random.rand(100, 784).astype('float32')
y_train = np.random.randint(0, 10, size=(100,))

model.fit(x_train, y_train, epochs=5)

Hugging Face

Hugging Face has become the go-to platform for pre-trained models and NLP tools. Their transformers library provides easy access to models:

from transformers import pipeline

# Text generation using GPT-2
generator = pipeline('text-generation', model='gpt2')
text = generator("AI has transformed", max_length=50)
print(text[0]['generated_text'])

# Question answering using BERT
qa_model = pipeline('question-answering', model='bert-large-uncased-whole-word-masking-finetuned-squad')
result = qa_model(question="What is AI?", context="AI is artificial intelligence...")
print(result['answer'])

Popular open-source models and their uses:

  1. BERT (Bidirectional Encoder Representations from Transformers)

    • Text classification
    • Question answering
    • Named entity recognition
  2. GPT-2

    • Text generation
    • Translation
    • Summarization
  3. DALL-E mini (now Craiyon)

    • Text-to-image generation
    • Visual concept exploration
  4. Stable Diffusion

    • Image generation and editing
    • Style transfer
    • Image-to-image translation

Model Deployment and Scaling

When deploying models in production:

  1. Model Serving Options:

    • TensorFlow Serving
    • PyTorch TorchServe
    • ONNX Runtime
    • Hugging Face Inference Endpoints
  2. Optimization Techniques (a quantization sketch follows this list):

    • Quantization
    • Pruning
    • Knowledge distillation
    • Model compression
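
As one concrete example of these techniques, dynamic quantization in PyTorch converts a model's linear-layer weights to 8-bit integers in a single call. A minimal sketch, reusing the SimpleNet class from the PyTorch section:

import torch

# Dynamic quantization: store Linear weights as int8, dequantize on the fly
quantized_model = torch.quantization.quantize_dynamic(
    SimpleNet(),           # the model defined in the PyTorch section above
    {torch.nn.Linear},     # layer types to quantize
    dtype=torch.qint8
)
print(quantized_model)

This typically shrinks the model by roughly 4x and speeds up CPU inference, at the cost of a small accuracy drop.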

Ethical Considerations

AI systems impact individuals and society in significant ways:

  • Data Privacy: Consider what data you collect and how it’s stored
  • Bias: Models can perpetuate or amplify existing biases in training data
  • Transparency: Users should understand when they’re interacting with AI
  • Environmental Impact: Large model training consumes significant computational resources