A model is a mathematical representation trained to recognise patterns in data. It transforms inputs into desired outputs. Below is a linear regression example using scikit-learn.
from sklearn.linear_model import LinearRegression
import numpy as np
# Create some sample data
X = np.array([[1], [2], [3], [4]])
y = np.array([2, 4, 6, 8]) # Linear relationship: y = 2x
# Create and train a model
model = LinearRegression()
model.fit(X, y)
# Make predictions
prediction = model.predict([[5]])
print(f"Predicted value: {prediction[0]}") # Should predict ~10
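To show what fitting does in this simple case, here is a minimal sketch that recovers the same line with ordinary least squares in NumPy. It assumes the same four data points as the example above; the variable names `design_matrix`, `slope`, and `intercept` are illustrative and not part of the original example.

```python
import numpy as np

X = np.array([[1], [2], [3], [4]], dtype=float)
y = np.array([2, 4, 6, 8], dtype=float)

# Add a column of ones so the intercept is learned alongside the slope
design_matrix = np.hstack([X, np.ones((X.shape[0], 1))])

# Solve for the slope and intercept that minimise the squared error
(slope, intercept), *_ = np.linalg.lstsq(design_matrix, y, rcond=None)

# Apply the learned line to a new input, x = 5
prediction = slope * 5 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

The learned parameters (a slope close to 2 and an intercept close to 0) are the "mathematical representation": once they are known, a prediction is just arithmetic.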
A prompt is the input you give an AI system to tell it what you want: a question, an instruction, or a piece of text to continue. For example, “Summarize this article in three bullet points” is a prompt.
A Large Language Model (LLM) is an artificial intelligence system trained on large amounts of text to process and generate language. Its core function is pattern recognition and text prediction at scale: given the text so far, it predicts what is likely to come next.
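To make "text prediction" concrete, here is a toy sketch that predicts the next word from word-pair counts. This is only an illustration of the idea: a real LLM uses a neural network over tokens rather than a lookup table, and the three-sentence corpus and the function names `train_bigram_model` and `predict_next` are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count how often each word follows each other word."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts

def predict_next(follow_counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = follow_counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical three-sentence "training set"
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
bigram_counts = train_bigram_model(corpus)
print(predict_next(bigram_counts, "sat"))  # "on" follows "sat" in both sentences that contain it
print(predict_next(bigram_counts, "the"))  # "cat" is the most frequent follower of "the"
```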
A token is a unit of text that the AI processes. A token can be a whole word, a word fragment, a punctuation mark, or a single character. For example, a tokenizer might split “unbelievable” into fragments such as “un”, “believ”, and “able”.
AI models use tokens to understand and generate text. The number of tokens in your prompt and in the AI’s response affects processing time and cost: the longer the input or output, the more tokens are used.
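As a rough sketch of how token counts drive cost, the example below splits text into word and punctuation tokens and multiplies the total by a price. Both parts are assumptions for illustration: production tokenizers use subword schemes rather than this regular expression, and the price of 0.002 per 1,000 tokens is a made-up figure, not a real rate.

```python
import re

def toy_tokenize(text):
    """Split text into word and punctuation tokens (a rough approximation)."""
    return re.findall(r"\w+|[^\w\s]", text)

def estimate_cost(prompt, response, price_per_1k_tokens):
    """Estimate cost from the combined token count of prompt and response."""
    total_tokens = len(toy_tokenize(prompt)) + len(toy_tokenize(response))
    return total_tokens, total_tokens * price_per_1k_tokens / 1000

prompt = "What is a token?"
response = "A token is a piece of text."

print(toy_tokenize(prompt))  # ['What', 'is', 'a', 'token', '?']

# Hypothetical price; real pricing varies by provider and model
total, cost = estimate_cost(prompt, response, price_per_1k_tokens=0.002)
print(total, cost)
```

Here the prompt contributes 5 tokens and the response 8, so both sides of the exchange count toward the total of 13.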
An agent is a system that can perceive its environment and take actions to achieve specific goals. While a model makes predictions, an agent uses those predictions to make decisions. Modern AI assistants are examples of agents that combine multiple models and decision-making systems.
class SimpleAgent:
    def __init__(self, model):
        self.model = model
        self.memory = []
    def perceive(self, environment_state):
        self.memory.append(environment_state)
    def decide_action(self):
        current_state = self.memory[-1]
        prediction = self.model.predict([current_state])
        return self.choose_best_action(prediction)
    def choose_best_action(self, prediction):
        # Logic to select the optimal action based on predictions
        pass
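As a minimal sketch of the perceive/decide loop, the following fills in the action-selection placeholder with a concrete rule. The thermostat scenario, the `ThresholdModel` stub, the `ThermostatAgent` name, and the 20-degree target are all hypothetical, chosen only to keep the example self-contained.

```python
class ThresholdModel:
    """Hypothetical stand-in for a trained model: predicts 1 if a room is too cold."""

    def __init__(self, target_temperature):
        self.target_temperature = target_temperature

    def predict(self, states):
        # Mirrors the scikit-learn convention: a list of inputs in, a list of predictions out
        return [1 if temperature < self.target_temperature else 0 for temperature in states]


class ThermostatAgent:
    def __init__(self, model):
        self.model = model
        self.memory = []

    def perceive(self, environment_state):
        self.memory.append(environment_state)

    def decide_action(self):
        current_state = self.memory[-1]
        prediction = self.model.predict([current_state])
        return self.choose_best_action(prediction)

    def choose_best_action(self, prediction):
        # Map the model's prediction to an action
        return "heat_on" if prediction[0] == 1 else "heat_off"


agent = ThermostatAgent(ThresholdModel(target_temperature=20.0))
actions = []
for temperature in [17.5, 19.0, 21.2]:
    agent.perceive(temperature)             # observe the environment
    actions.append(agent.decide_action())   # act on the model's prediction
print(actions)  # ['heat_on', 'heat_on', 'heat_off']
```

The model only answers "is it too cold?"; the agent is the part that turns that answer into an action.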
from sklearn.ensemble import RandomForestClassifier
# Training data: [feature1, feature2] -> label
X_train = [[0, 0], [1, 1], [1, 0], [0, 1]]
y_train = [0, 1, 1, 0] # Binary classification labels
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
Transformers are neural network architectures that revolutionized natural language processing and beyond. Unlike earlier sequential models, transformers process all input elements simultaneously and use a mechanism called “attention” to understand relationships between different parts of the input.
import torch
import torch.nn as nn
class SimpleTransformerBlock(nn.Module):
    def __init__(self, embed_dim, num_heads):
        super().__init__()
        self.attention = nn.MultiheadAttention(embed_dim, num_heads)
        self.feed_forward = nn.Sequential(
            nn.Linear(embed_dim, embed_dim * 4),
            nn.ReLU(),
            nn.Linear(embed_dim * 4, embed_dim)
        )
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)
    def forward(self, x):
        # x has shape (seq_len, batch, embed_dim), the nn.MultiheadAttention default
        # Self-attention mechanism, followed by a residual connection and normalization
        attention_output, _ = self.attention(x, x, x)
        x = self.norm1(x + attention_output)
        # Feed-forward network, followed by a residual connection and normalization
        ff_output = self.feed_forward(x)
        x = self.norm2(x + ff_output)
        return x
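The attention mechanism itself can be sketched in a few lines of NumPy. This is a simplified, single-head version: a real transformer first passes the input through learned query, key, and value projections, whereas this sketch uses the input directly for all three. The function name and the array sizes are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Return attention outputs and weights for 2-D arrays of shape (seq_len, dim)."""
    dim = queries.shape[-1]
    # Similarity of every position with every other position, scaled by sqrt(dim)
    scores = queries @ keys.T / np.sqrt(dim)
    # Row-wise softmax (shifted by the row maximum for numerical stability)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors
    return weights @ values, weights

rng = np.random.default_rng(seed=0)
x = rng.normal(size=(3, 4))  # 3 tokens with 4-dimensional embeddings (arbitrary sizes)
output, weights = scaled_dot_product_attention(x, x, x)
print(output.shape)          # one output vector per token
print(weights.sum(axis=-1))  # each row of weights sums to 1
```

Because the score matrix covers every pair of positions at once, all tokens are processed together rather than one after another.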
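import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
class InferenceService:
    def __init__(self):
        # Note: "bert-base-uncased" ships without a fine-tuned classification head, so the
        # logits are not meaningful until the model is fine-tuned on a labelled task
        self.tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        self.model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
    def predict(self, text):
        inputs = self.tokenizer(text, return_tensors="pt")
        with torch.no_grad():  # gradients are not needed at inference time
            outputs = self.model(**inputs)
        return outputs.logits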
PyTorch is particularly useful for rapid prototyping:
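import torch
import torch.nn as nn
# Define a simple neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 10)
        )
    def forward(self, x):
        return self.layers(x)
# Placeholder data (an assumption for illustration): 64 random 784-feature inputs, 10 classes
inputs = torch.randn(64, 784)
targets = torch.randint(0, 10, (64,))
epochs = 5
# Training loop example
model = SimpleNet()
optimizer = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()
for epoch in range(epochs):
    optimizer.zero_grad()             # clear gradients from the previous step
    outputs = model(inputs)           # forward pass
    loss = loss_fn(outputs, targets)  # compare predictions with labels
    loss.backward()                   # backpropagate
    optimizer.step()                  # update the weights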
TensorFlow is widely used for production deployment and mobile applications. It offers tools for model serving and optimization, such as TensorFlow Serving and TensorFlow Lite:
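import numpy as np
import tensorflow as tf
# Define a model using Keras (TensorFlow's high-level API)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])
# Compile and train
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Placeholder data (an assumption for illustration): random inputs and integer class labels
x_train = np.random.rand(64, 784).astype('float32')
y_train = np.random.randint(0, 10, size=(64,))
model.fit(x_train, y_train, epochs=5)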
Hugging Face has become the go-to platform for pre-trained models and NLP tools. Their transformers library provides easy access to models:
from transformers import pipeline
# Text generation using GPT-2
generator = pipeline('text-generation', model='gpt2')
text = generator("AI has transformed", max_length=50)
# Question answering using BERT
qa_model = pipeline('question-answering', model='bert-large-uncased-whole-word-masking-finetuned-squad')
result = qa_model(question="What is AI?", context="AI is artificial intelligence...")
Popular open-source models and their uses:
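BERT (Bidirectional Encoder Representations from Transformers): text understanding tasks such as classification and question answering
GPT-2: text generation
DALL-E mini (now Craiyon): image generation from text prompts
Stable Diffusion: image generation from text prompts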
When deploying models in production:
Model Serving Options:
Optimization Techniques:
AI systems impact individuals and society in significant ways: