Machine learning in Python has revolutionized data science and artificial intelligence, providing accessible yet powerful tools for building predictive models, neural networks, and intelligent systems. Python's rich ecosystem of libraries, combined with its simplicity and extensive community support, makes it the dominant language for machine learning research and production applications.
## Overview
Python's machine learning ecosystem offers solutions for every stage of the ML lifecycle, from data preprocessing and exploratory analysis to model training, evaluation, hyperparameter tuning, and deployment. The landscape is anchored by three major frameworks: scikit-learn for traditional machine learning, TensorFlow for production-scale deep learning, and PyTorch for research and dynamic neural networks.
This guide covers the fundamental concepts, practical implementations, best practices, and real-world workflows for building machine learning systems in Python.
## Core Machine Learning Libraries
### Scikit-learn
Scikit-learn is the foundational library for traditional machine learning in Python, offering simple and efficient tools for data mining and analysis.
#### Key Features
- Consistent API: All algorithms follow the fit/predict pattern
- Comprehensive algorithms: Classification, regression, clustering, dimensionality reduction
- Model selection: Cross-validation, grid search, metrics
- Preprocessing: Scaling, encoding, feature extraction
- Pipeline support: Chain transformers and estimators
- Well-documented: Extensive examples and user guide
#### Installation

```bash
pip install scikit-learn
# or with UV for faster installation
uv pip install scikit-learn
```
#### Basic Example: Classification

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split data
XTrain, XTest, yTrain, yTest = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(XTrain, yTrain)

# Predict and evaluate
yPred = clf.predict(XTest)
print(f"Accuracy: {accuracy_score(yTest, yPred):.3f}")
print(classification_report(yTest, yPred, target_names=iris.target_names))
```
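Because every estimator shares the fit/predict contract noted in the key features above, swapping models is a one-line change. A quick sketch reusing the split from this example:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Any estimator can slot into the same train/score loop
for est in (LogisticRegression(max_iter=1000), SVC()):
    est.fit(XTrain, yTrain)
    print(type(est).__name__, f"{est.score(XTest, yTest):.3f}")
```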
#### Common Algorithms
Classification:
- Logistic Regression
- Support Vector Machines (SVM)
- Decision Trees
- Random Forests
- Gradient Boosting (scikit-learn's `HistGradientBoostingClassifier`; the external XGBoost and LightGBM libraries are popular alternatives)
- Naive Bayes
- K-Nearest Neighbors (KNN)
Regression:
- Linear Regression
- Ridge/Lasso Regression
- Support Vector Regression
- Decision Tree Regressor
- Random Forest Regressor
- Gradient Boosting Regressor
Clustering:
- K-Means
- DBSCAN
- Hierarchical Clustering
- Gaussian Mixture Models
Dimensionality Reduction:
- Principal Component Analysis (PCA)
- t-SNE
- UMAP (via the separate `umap-learn` package)
- Linear Discriminant Analysis (LDA)
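Clustering follows the same estimator API even though it is unsupervised. A minimal sketch using K-Means on the iris features `X` loaded above, with the silhouette score as a quick quality check (assumes scikit-learn >= 1.2 for `n_init="auto"`):

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Fit three clusters (iris has three species)
kmeans = KMeans(n_clusters=3, n_init="auto", random_state=42)
clusterLabels = kmeans.fit_predict(X)

# Silhouette ranges from -1 to 1; higher means tighter, better-separated clusters
print(f"Silhouette score: {silhouette_score(X, clusterLabels):.3f}")
```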
### TensorFlow and Keras
TensorFlow is Google's open-source deep learning framework, with Keras as its high-level API for building and training neural networks.
#### Key Features
- Production-ready: Industry-standard for deployment
- TensorFlow Serving: Model serving infrastructure
- TensorFlow Lite: Mobile and embedded deployment
- TensorFlow.js: Browser-based machine learning
- Distributed training: Multi-GPU and TPU support
- TensorBoard: Visualization and monitoring
- Keras API: Simple, intuitive model building
#### Installation

```bash
pip install tensorflow
# or with UV
uv pip install tensorflow
```
#### Basic Example

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# Load dataset
(XTrain, yTrain), (XTest, yTest) = keras.datasets.mnist.load_data()

# Normalize pixel values
XTrain = XTrain.astype("float32") / 255.0
XTest = XTest.astype("float32") / 255.0

# Build model
model = keras.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(10, activation="softmax")
])

# Compile model
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"]
)

# Train model
history = model.fit(
    XTrain, yTrain,
    epochs=5,
    validation_split=0.1,
    batch_size=32,
    verbose=1
)

# Evaluate
testLoss, testAcc = model.evaluate(XTest, yTest, verbose=0)
print(f"Test accuracy: {testAcc:.4f}")
```
#### Advanced Model Architectures
```python
# Convolutional Neural Network for image classification
def CreateCNN():
    model = keras.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax")
    ])
    return model

# Recurrent Neural Network for sequence data
def CreateRNN():
    model = keras.Sequential([
        layers.LSTM(128, return_sequences=True, input_shape=(None, 1)),
        layers.LSTM(64),
        layers.Dense(1)
    ])
    return model

# Transfer Learning with a pre-trained model
baseModel = keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet"
)
baseModel.trainable = False

model = keras.Sequential([
    baseModel,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax")
])
```
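The CNN above expects a channel dimension, so the MNIST arrays from the basic example need reshaping before training. A minimal usage sketch reusing `XTrain` and `yTrain` from that example:

```python
# Add the channel axis: (60000, 28, 28) -> (60000, 28, 28, 1)
XTrainCnn = XTrain[..., np.newaxis]

cnn = CreateCNN()
cnn.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"]
)
cnn.fit(XTrainCnn, yTrain, epochs=3, validation_split=0.1, batch_size=32)
```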
### PyTorch
PyTorch is a deep learning framework developed at Meta (formerly Facebook), favored by researchers for its dynamic computation graphs and Pythonic design.
#### Key Features
- Dynamic computation graphs: Define-by-run paradigm
- Pythonic: Feels natural to Python developers
- Research-friendly: Rapid prototyping and experimentation
- TorchScript: Production deployment
- Strong GPU acceleration: Seamless CUDA integration
- Active community: Cutting-edge research implementations
- PyTorch Lightning: High-level training framework
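Define-by-run means the computation graph is recorded as ordinary Python executes, so gradients can be inspected immediately. A tiny sketch:

```python
import torch

# The graph is built as these lines run
x = torch.tensor(2.0, requires_grad=True)
y = x ** 3 + 2 * x

# Backpropagate: dy/dx = 3x^2 + 2 = 14 at x = 2
y.backward()
print(x.grad)  # tensor(14.)
```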
#### Installation

```bash
pip install torch torchvision torchaudio
# or with UV
uv pip install torch torchvision torchaudio
```
#### Basic Example

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Define model
class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linearRelu = nn.Sequential(
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(128, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linearRelu(x)
        return logits

# Load data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
trainData = datasets.MNIST(
    root="data",
    train=True,
    download=True,
    transform=transform
)
trainLoader = DataLoader(trainData, batch_size=32, shuffle=True)

# Initialize model, loss, optimizer
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = NeuralNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
model.train()
for epoch in range(5):
    runningLoss = 0.0
    for inputs, labels in trainLoader:
        inputs, labels = inputs.to(device), labels.to(device)

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        runningLoss += loss.item()

    print(f"Epoch {epoch+1}, Loss: {runningLoss/len(trainLoader):.4f}")
```
#### Advanced PyTorch Patterns
```python
# Custom dataset
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, data, labels, transform=None):
        self.data = data
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sample = self.data[idx]
        label = self.labels[idx]
        if self.transform:
            sample = self.transform(sample)
        return sample, label

# Learning rate scheduling
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer,
    mode="min",
    factor=0.1,
    patience=5
)

# Early stopping
class EarlyStopping:
    def __init__(self, patience=7, min_delta=0):
        self.patience = patience
        self.min_delta = min_delta
        self.counter = 0
        self.bestLoss = None
        self.earlyStop = False

    def __call__(self, valLoss):
        if self.bestLoss is None:
            self.bestLoss = valLoss
        elif valLoss > self.bestLoss - self.min_delta:
            self.counter += 1
            if self.counter >= self.patience:
                self.earlyStop = True
        else:
            self.bestLoss = valLoss
            self.counter = 0
```
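A minimal sketch of wiring the scheduler and early stopper into an epoch loop; `computeValidationLoss` is a hypothetical helper that averages `criterion` over a validation loader:

```python
earlyStopping = EarlyStopping(patience=7)

for epoch in range(100):
    # ... one epoch of training, as in the basic example ...
    valLoss = computeValidationLoss(model, valLoader, criterion)  # hypothetical helper

    scheduler.step(valLoss)  # ReduceLROnPlateau reacts to the monitored value
    earlyStopping(valLoss)
    if earlyStopping.earlyStop:
        print(f"Stopping early at epoch {epoch + 1}")
        break
```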
## Data Preprocessing

### Data Loading and Exploration
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load data
df = pd.read_csv("data.csv")

# Explore data
print(df.info())
print(df.describe())
print(df.isnull().sum())

# Visualize distributions
df.hist(figsize=(12, 10))
plt.tight_layout()
plt.show()

# Correlation heatmap (numeric_only avoids errors on non-numeric columns)
plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm")
plt.show()
```
### Feature Engineering
```python
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.impute import SimpleImputer

# Handle missing values
imputer = SimpleImputer(strategy="mean")
dfImputed = pd.DataFrame(
    imputer.fit_transform(df),
    columns=df.columns
)

# Encode categorical variables
le = LabelEncoder()
df["category_encoded"] = le.fit_transform(df["category"])

# Or use one-hot encoding
dfEncoded = pd.get_dummies(df, columns=["category"], prefix="cat")

# Scale features
scaler = StandardScaler()
XScaled = scaler.fit_transform(X)

# Create new features
df["feature_ratio"] = df["feature1"] / (df["feature2"] + 1)
df["feature_interaction"] = df["feature1"] * df["feature2"]
df["feature_log"] = np.log1p(df["feature1"])
```
### Data Splitting
```python
from sklearn.model_selection import train_test_split

# Basic split
XTrain, XTest, yTrain, yTest = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42,
    stratify=y  # Maintain class distribution
)

# Train/validation/test split: 70% train, 15% validation, 15% test
XTrain, XTemp, yTrain, yTemp = train_test_split(
    X, y, test_size=0.3, random_state=42
)
XVal, XTest, yVal, yTest = train_test_split(
    XTemp, yTemp, test_size=0.5, random_state=42
)
```
## Model Training

### Cross-Validation
```python
from sklearn.model_selection import cross_val_score, KFold

# K-Fold cross-validation
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(
    model,
    X,
    y,
    cv=kfold,
    scoring="accuracy"
)
print(f"Cross-validation scores: {scores}")
print(f"Mean accuracy: {scores.mean():.3f} (+/- {scores.std()*2:.3f})")

# Stratified K-Fold for imbalanced data
from sklearn.model_selection import StratifiedKFold

skfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=skfold)
```
### Hyperparameter Tuning
```python
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Grid search
paramGrid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 10, 20, 30],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4]
}
gridSearch = GridSearchCV(
    RandomForestClassifier(random_state=42),
    paramGrid,
    cv=5,
    scoring="accuracy",
    n_jobs=-1,
    verbose=1
)
gridSearch.fit(XTrain, yTrain)
print(f"Best parameters: {gridSearch.best_params_}")
print(f"Best score: {gridSearch.best_score_:.3f}")

# Randomized search (faster for large parameter spaces)
from scipy.stats import randint

paramDist = {
    "n_estimators": randint(50, 500),
    "max_depth": [None] + list(range(10, 50)),
    "min_samples_split": randint(2, 20),
    "min_samples_leaf": randint(1, 10)
}
randomSearch = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    paramDist,
    n_iter=100,
    cv=5,
    scoring="accuracy",
    n_jobs=-1,
    random_state=42
)
randomSearch.fit(XTrain, yTrain)
```
### Pipeline Creation
```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif

# Create pipeline
pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler()),
    ("feature_selection", SelectKBest(f_classif, k=10)),
    ("classifier", RandomForestClassifier(random_state=42))
])

# Fit pipeline
pipeline.fit(XTrain, yTrain)

# Predict
yPred = pipeline.predict(XTest)

# Grid search also works with pipelines: prefix each parameter with its step name
pipelineParams = {
    "feature_selection__k": [5, 10, 15],
    "classifier__n_estimators": [50, 100, 200],
    "classifier__max_depth": [None, 10, 20]
}
gridSearch = GridSearchCV(pipeline, pipelineParams, cv=5)
gridSearch.fit(XTrain, yTrain)
```
## Model Evaluation

### Classification Metrics
```python
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    confusion_matrix,
    classification_report,
    roc_auc_score,
    roc_curve
)

# Basic metrics
accuracy = accuracy_score(yTest, yPred)
precision = precision_score(yTest, yPred, average="weighted")
recall = recall_score(yTest, yPred, average="weighted")
f1 = f1_score(yTest, yPred, average="weighted")

print(f"Accuracy: {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall: {recall:.3f}")
print(f"F1-Score: {f1:.3f}")

# Confusion matrix
cm = confusion_matrix(yTest, yPred)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()

# Classification report
print(classification_report(yTest, yPred))

# ROC curve and AUC (binary classification: take the positive-class probability)
yPredProba = model.predict_proba(XTest)[:, 1]
fpr, tpr, thresholds = roc_curve(yTest, yPredProba)
auc = roc_auc_score(yTest, yPredProba)

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f"AUC = {auc:.3f}")
plt.plot([0, 1], [0, 1], "k--")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()
```
### Regression Metrics
```python
from sklearn.metrics import (
    mean_squared_error,
    mean_absolute_error,
    r2_score,
    mean_absolute_percentage_error
)

# Calculate metrics
mse = mean_squared_error(yTest, yPred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(yTest, yPred)
r2 = r2_score(yTest, yPred)
mape = mean_absolute_percentage_error(yTest, yPred)

print(f"MSE: {mse:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"MAE: {mae:.3f}")
print(f"R² Score: {r2:.3f}")
print(f"MAPE: {mape:.3f}")

# Residual plot
residuals = yTest - yPred
plt.figure(figsize=(10, 6))
plt.scatter(yPred, residuals, alpha=0.5)
plt.axhline(y=0, color="r", linestyle="--")
plt.xlabel("Predicted Values")
plt.ylabel("Residuals")
plt.title("Residual Plot")
plt.show()

# Actual vs Predicted
plt.figure(figsize=(10, 6))
plt.scatter(yTest, yPred, alpha=0.5)
plt.plot([yTest.min(), yTest.max()], [yTest.min(), yTest.max()], "r--", lw=2)
plt.xlabel("Actual Values")
plt.ylabel("Predicted Values")
plt.title("Actual vs Predicted")
plt.show()
```
## Advanced Techniques

### Ensemble Methods
```python
from sklearn.ensemble import (
    VotingClassifier,
    BaggingClassifier,
    AdaBoostClassifier,
    StackingClassifier
)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# Voting ensemble
clf1 = LogisticRegression(random_state=42)
clf2 = RandomForestClassifier(random_state=42)
clf3 = SVC(probability=True, random_state=42)

votingClf = VotingClassifier(
    estimators=[("lr", clf1), ("rf", clf2), ("svc", clf3)],
    voting="soft"
)
votingClf.fit(XTrain, yTrain)

# Bagging
baggingClf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=100,
    max_samples=0.8,
    random_state=42
)
baggingClf.fit(XTrain, yTrain)

# Boosting (AdaBoost)
adaClf = AdaBoostClassifier(
    n_estimators=100,
    learning_rate=1.0,
    random_state=42
)
adaClf.fit(XTrain, yTrain)

# Stacking
baseEstimators = [
    ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ("svc", SVC(random_state=42))
]
stackingClf = StackingClassifier(
    estimators=baseEstimators,
    final_estimator=LogisticRegression()
)
stackingClf.fit(XTrain, yTrain)
```
### Feature Importance
```python
import matplotlib.pyplot as plt

# For tree-based models (featureNames is a list of column names,
# e.g. iris.feature_names or df.columns)
importance = model.feature_importances_
indices = np.argsort(importance)[::-1]

plt.figure(figsize=(10, 6))
plt.title("Feature Importances")
plt.bar(range(X.shape[1]), importance[indices])
plt.xticks(range(X.shape[1]), [featureNames[i] for i in indices], rotation=90)
plt.tight_layout()
plt.show()

# Permutation importance (model-agnostic)
from sklearn.inspection import permutation_importance

perm_importance = permutation_importance(
    model, XTest, yTest,
    n_repeats=10,
    random_state=42
)
sortedIdx = perm_importance.importances_mean.argsort()[::-1]

plt.figure(figsize=(10, 6))
plt.barh(range(len(sortedIdx)), perm_importance.importances_mean[sortedIdx])
plt.yticks(range(len(sortedIdx)), [featureNames[i] for i in sortedIdx])
plt.xlabel("Permutation Importance")
plt.tight_layout()
plt.show()
```
### Handling Imbalanced Data
```python
# Requires the imbalanced-learn package: pip install imbalanced-learn
from imblearn.over_sampling import SMOTE, ADASYN
from imblearn.under_sampling import RandomUnderSampler
from imblearn.combine import SMOTETomek

# SMOTE (Synthetic Minority Over-sampling Technique)
smote = SMOTE(random_state=42)
XResampled, yResampled = smote.fit_resample(XTrain, yTrain)

# ADASYN (Adaptive Synthetic Sampling)
adasyn = ADASYN(random_state=42)
XResampled, yResampled = adasyn.fit_resample(XTrain, yTrain)

# Combined over- and under-sampling
smt = SMOTETomek(random_state=42)
XResampled, yResampled = smt.fit_resample(XTrain, yTrain)

# Class weights
from sklearn.utils.class_weight import compute_class_weight

classWeights = compute_class_weight(
    "balanced",
    classes=np.unique(yTrain),
    y=yTrain
)
classWeightDict = dict(enumerate(classWeights))

model = RandomForestClassifier(
    class_weight=classWeightDict,
    random_state=42
)
```
### Dimensionality Reduction
```python
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import umap  # from the separate umap-learn package

# PCA
pca = PCA(n_components=0.95)  # Retain 95% of the variance
XPca = pca.fit_transform(XScaled)
print(f"Original dimensions: {X.shape[1]}")
print(f"Reduced dimensions: {XPca.shape[1]}")

# Visualize explained variance
plt.figure(figsize=(10, 6))
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel("Number of Components")
plt.ylabel("Cumulative Explained Variance")
plt.title("PCA Explained Variance")
plt.grid()
plt.show()

# t-SNE for visualization
tsne = TSNE(n_components=2, random_state=42)
XTsne = tsne.fit_transform(XScaled)

plt.figure(figsize=(10, 8))
plt.scatter(XTsne[:, 0], XTsne[:, 1], c=y, cmap="viridis", alpha=0.6)
plt.colorbar()
plt.title("t-SNE Visualization")
plt.show()

# UMAP (faster alternative to t-SNE)
reducer = umap.UMAP(random_state=42)
XUmap = reducer.fit_transform(XScaled)

plt.figure(figsize=(10, 8))
plt.scatter(XUmap[:, 0], XUmap[:, 1], c=y, cmap="viridis", alpha=0.6)
plt.colorbar()
plt.title("UMAP Visualization")
plt.show()
```
## Deep Learning Advanced Topics

### Custom Training Loops (PyTorch)
```python
def Train(model, trainLoader, valLoader, criterion, optimizer, epochs=10):
    history = {"train_loss": [], "val_loss": [], "val_acc": []}

    for epoch in range(epochs):
        # Training phase
        model.train()
        trainLoss = 0.0
        for inputs, labels in trainLoader:
            inputs, labels = inputs.to(device), labels.to(device)

            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            trainLoss += loss.item()

        # Validation phase
        model.eval()
        valLoss = 0.0
        correct = 0
        total = 0
        with torch.no_grad():
            for inputs, labels in valLoader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                valLoss += loss.item()

                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        # Record metrics
        avgTrainLoss = trainLoss / len(trainLoader)
        avgValLoss = valLoss / len(valLoader)
        valAccuracy = 100 * correct / total

        history["train_loss"].append(avgTrainLoss)
        history["val_loss"].append(avgValLoss)
        history["val_acc"].append(valAccuracy)

        print(f"Epoch {epoch+1}/{epochs}")
        print(f"Train Loss: {avgTrainLoss:.4f}")
        print(f"Val Loss: {avgValLoss:.4f}, Val Acc: {valAccuracy:.2f}%")

    return history
```
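A minimal usage sketch, reusing `model`, `criterion`, `optimizer`, `trainLoader`, and `transform` from the basic PyTorch example, with a validation loader built from the MNIST test split:

```python
valData = datasets.MNIST(
    root="data",
    train=False,
    download=True,
    transform=transform
)
valLoader = DataLoader(valData, batch_size=32)

history = Train(model, trainLoader, valLoader, criterion, optimizer, epochs=5)
```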
### Transfer Learning
```python
# TensorFlow/Keras (numClasses is the number of target classes in your task)
baseModel = keras.applications.ResNet50(
    weights="imagenet",
    include_top=False,
    input_shape=(224, 224, 3)
)

# Freeze base model layers
baseModel.trainable = False

# Add custom layers
model = keras.Sequential([
    baseModel,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(numClasses, activation="softmax")
])

# Compile and train
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss="categorical_crossentropy",
    metrics=["accuracy"]
)

# Fine-tuning: after initial training, unfreeze the last 30 layers
# and recompile with a lower learning rate
baseModel.trainable = True
for layer in baseModel.layers[:-30]:
    layer.trainable = False

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    loss="categorical_crossentropy",
    metrics=["accuracy"]
)
```
### Data Augmentation
```python
# TensorFlow/Keras
dataAugmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.2),
    layers.RandomZoom(0.2),
    layers.RandomTranslation(0.1, 0.1),
    layers.RandomContrast(0.2)
])

# PyTorch
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(20),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
```
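In Keras, the augmentation pipeline is usually placed at the front of a model so it runs during training only; the random layers are automatically inactive at inference time. A minimal sketch:

```python
model = keras.Sequential([
    dataAugmentation,  # random transforms apply only in training mode
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation="softmax")
])
```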
### Callbacks and Monitoring
```python
# TensorFlow/Keras callbacks
from tensorflow.keras.callbacks import (
    EarlyStopping,
    ModelCheckpoint,
    ReduceLROnPlateau,
    TensorBoard
)

callbacks = [
    EarlyStopping(
        monitor="val_loss",
        patience=10,
        restore_best_weights=True
    ),
    ModelCheckpoint(
        filepath="best_model.h5",
        monitor="val_accuracy",
        save_best_only=True
    ),
    ReduceLROnPlateau(
        monitor="val_loss",
        factor=0.5,
        patience=5,
        min_lr=1e-7
    ),
    TensorBoard(log_dir="./logs")
]

history = model.fit(
    XTrain, yTrain,
    validation_data=(XVal, yVal),
    epochs=100,
    callbacks=callbacks
)
```
## Model Deployment

### Model Serialization
```python
# Scikit-learn with joblib
import joblib

# Save model
joblib.dump(model, "model.pkl")

# Load model
loadedModel = joblib.load("model.pkl")

# TensorFlow/Keras
# Save entire model (recent Keras versions prefer the native .keras format;
# "model.h5" saves the legacy HDF5 format)
model.save("model.keras")

# Load model
loadedModel = keras.models.load_model("model.keras")

# Save weights only
model.save_weights("model.weights.h5")

# PyTorch
# Save model
torch.save(model.state_dict(), "model.pth")

# Load model
model = NeuralNet()
model.load_state_dict(torch.load("model.pth"))
model.eval()
```
### Flask API
```python
from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)

# Load model at startup
model = joblib.load("model.pkl")

@app.route("/predict", methods=["POST"])
def Predict():
    try:
        data = request.get_json()
        features = np.array(data["features"]).reshape(1, -1)

        prediction = model.predict(features)
        probability = model.predict_proba(features)

        return jsonify({
            "prediction": int(prediction[0]),
            "probability": probability[0].tolist()
        })
    except Exception as e:
        return jsonify({"error": str(e)}), 400

if __name__ == "__main__":
    app.run(debug=False, host="0.0.0.0", port=5000)
```
### FastAPI (Modern Alternative)
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI()

# Load model
model = joblib.load("model.pkl")

class PredictionInput(BaseModel):
    features: list[float]

class PredictionOutput(BaseModel):
    prediction: int
    probability: list[float]

@app.post("/predict", response_model=PredictionOutput)
async def Predict(input_data: PredictionInput):
    try:
        features = np.array(input_data.features).reshape(1, -1)
        prediction = model.predict(features)
        probability = model.predict_proba(features)

        return PredictionOutput(
            prediction=int(prediction[0]),
            probability=probability[0].tolist()
        )
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
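Either service can be exercised with a small client. A sketch using `requests` (assumes the FastAPI app is running locally on port 8000; for the Flask version, change the port to 5000):

```python
import requests

response = requests.post(
    "http://localhost:8000/predict",
    json={"features": [5.1, 3.5, 1.4, 0.2]}  # four features, matching an iris-style model
)
print(response.json())  # e.g. {"prediction": 0, "probability": [...]}
```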
### Docker Deployment
```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy model and application
COPY model.pkl .
COPY app.py .

# Expose port
EXPOSE 8000

# Run application
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
### ONNX Export
```python
# PyTorch to ONNX (the dummy input must match your model's expected input shape)
dummyInput = torch.randn(1, 3, 224, 224).to(device)
torch.onnx.export(
    model,
    dummyInput,
    "model.onnx",
    export_params=True,
    opset_version=11,
    input_names=["input"],
    output_names=["output"]
)

# Load and run ONNX model
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: dummyInput.cpu().numpy()})
```
## Best Practices

### Project Structure
```text
ml-project/
├── data/
│   ├── raw/
│   ├── processed/
│   └── external/
├── notebooks/
│   ├── 01-exploration.ipynb
│   ├── 02-feature-engineering.ipynb
│   └── 03-modeling.ipynb
├── src/
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   └── preprocessing.py
│   ├── features/
│   │   ├── __init__.py
│   │   └── engineering.py
│   ├── models/
│   │   ├── __init__.py
│   │   ├── train.py
│   │   └── predict.py
│   └── utils/
│       ├── __init__.py
│       └── helpers.py
├── tests/
│   └── test_models.py
├── models/
│   └── trained_model.pkl
├── requirements.txt
├── setup.py
└── README.md
```
### Version Control
```python
# Track model versions with MLflow
import mlflow

with mlflow.start_run():
    # Log parameters
    mlflow.log_params({
        "n_estimators": 100,
        "max_depth": 10,
        "learning_rate": 0.01
    })

    # Log metrics
    mlflow.log_metrics({
        "accuracy": accuracy,
        "f1_score": f1
    })

    # Log model
    mlflow.sklearn.log_model(model, "model")
```
### Code Quality
```python
# Type hints
from typing import Tuple
import numpy.typing as npt
import pandas as pd

def PreprocessData(
    data: pd.DataFrame,
    targetColumn: str
) -> Tuple[npt.NDArray, npt.NDArray]:
    """
    Preprocess raw data for modeling.

    Parameters
    ----------
    data : pd.DataFrame
        Raw input data
    targetColumn : str
        Name of target column

    Returns
    -------
    Tuple[npt.NDArray, npt.NDArray]
        Features and target arrays
    """
    X = data.drop(columns=[targetColumn]).values
    y = data[targetColumn].values
    return X, y

# Unit tests (pytest only collects functions whose names start with "test_")
import numpy as np
import joblib
import pytest

def test_model_prediction():
    model = joblib.load("model.pkl")
    testInput = np.array([[1.0, 2.0, 3.0]])
    prediction = model.predict(testInput)
    assert prediction is not None
    assert len(prediction) == 1
```
### Reproducibility
```python
# Set random seeds
import random
import numpy as np
import torch
import tensorflow as tf

def SetSeeds(seed: int = 42):
    """Set random seeds for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    tf.random.set_seed(seed)

    # Additional PyTorch settings for deterministic behavior
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

SetSeeds(42)
```
## Performance Optimization

### GPU Acceleration
```python
# TensorFlow GPU configuration
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)

# PyTorch GPU usage
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Check CUDA availability
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"Device count: {torch.cuda.device_count()}")
```
### Mixed Precision Training
```python
# TensorFlow
from tensorflow.keras import mixed_precision

policy = mixed_precision.Policy("mixed_float16")
mixed_precision.set_global_policy(policy)

# PyTorch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

for inputs, labels in trainLoader:
    inputs, labels = inputs.to(device), labels.to(device)
    optimizer.zero_grad()

    # Run the forward pass in mixed precision
    with autocast():
        outputs = model(inputs)
        loss = criterion(outputs, labels)

    # Scale the loss to avoid float16 gradient underflow
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```
### Model Quantization
```python
# TensorFlow Lite quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tfliteModel = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tfliteModel)

# PyTorch dynamic quantization (stores Linear weights as int8)
quantizedModel = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},
    dtype=torch.qint8
)
```
## Common Pitfalls

### Data Leakage
```python
# ❌ Avoid - scaling before splitting leaks test-set statistics into training
XScaled = scaler.fit_transform(X)
XTrain, XTest = train_test_split(XScaled, test_size=0.2)

# ✅ Correct - fit the scaler on training data only
XTrain, XTest = train_test_split(X, test_size=0.2)
XTrain = scaler.fit_transform(XTrain)
XTest = scaler.transform(XTest)
```
### Overfitting
```python
# Signs of overfitting:
# - High training accuracy, low validation accuracy
# - Large gap between train and validation loss

# Solutions:
# 1. Regularization
model = RandomForestClassifier(
    max_depth=10,          # Limit tree depth
    min_samples_leaf=5,    # Require minimum samples per leaf
    max_features="sqrt"    # Random feature subset
)

# 2. Dropout (neural networks)
model = keras.Sequential([
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax")
])

# 3. Early stopping (see Callbacks and Monitoring above)
# 4. More training data
# 5. Cross-validation
```
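A quick way to spot the gap is to plot the Keras training history. A minimal sketch, assuming a `history` object returned by `model.fit` with a validation split, as in the earlier examples:

```python
import matplotlib.pyplot as plt

# Diverging curves (falling train loss, rising val loss) indicate overfitting
plt.plot(history.history["loss"], label="train loss")
plt.plot(history.history["val_loss"], label="val loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()
```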
## Resources and References

### Essential Libraries
- scikit-learn: Traditional ML algorithms
- TensorFlow: Deep learning framework
- PyTorch: Research-focused deep learning
- XGBoost: Gradient boosting
- LightGBM: Fast gradient boosting
- CatBoost: Categorical data boosting
### Data Processing
- pandas: Data manipulation
- NumPy: Numerical computing
- Polars: Fast DataFrame library
- Dask: Parallel computing for large datasets
### Visualization
- Matplotlib: Basic plotting
- Seaborn: Statistical visualization
- Plotly: Interactive plots
- Altair: Declarative visualization
### Model Tracking
- MLflow: ML lifecycle management
- Weights & Biases: Experiment tracking
- Neptune.ai: ML metadata store
- DVC: Data version control
### AutoML
- Auto-sklearn: Automated sklearn
- TPOT: Genetic programming AutoML
- H2O AutoML: Enterprise AutoML
- PyCaret: Low-code ML library
### Learning Resources
- Machine Learning Mastery: Practical tutorials
- Kaggle Learn: Interactive courses
- Fast.ai: Deep learning for coders
- Coursera ML Courses: University courses
- Papers with Code: Latest research with implementations