Friday, 3 January 2025

Hour 30: Hyperparameter Optimization

#### Concept

Hyperparameter optimization is the process of finding the hyperparameter settings that maximize a machine learning model's performance. Hyperparameters are configuration values chosen before training begins; they control the learning algorithm's behavior and, through it, the quality of the resulting model.

#### Key Aspects

1. Hyperparameters vs. Parameters:

   - Parameters: Learned from data during model training (e.g., weights in neural networks).

   - Hyperparameters: Set before training and control the learning process (e.g., learning rate, number of trees in a random forest).

2. Importance of Hyperparameter Tuning:

   - Impact on Model Performance: Proper tuning can significantly improve model accuracy and generalization.

   - Algorithm Sensitivity: Different algorithms require different hyperparameters for optimal performance.

3. Hyperparameter Optimization Techniques:

   - Grid Search: Exhaustively evaluate every combination in a predefined grid of hyperparameter values (a minimal sketch follows this list).

   - Random Search: Randomly sample hyperparameter combinations from predefined distributions; this is the technique used in the worked example below.

   - Bayesian Optimization: Build a probabilistic model of how hyperparameter settings affect performance and use it to pick promising configurations to try next (sketched below).

   - Gradient-based Optimization: Treat continuous hyperparameters as differentiable and optimize them using gradients of a validation objective.

4. Evaluation Metrics:

   - Cross-Validation: Assess model performance by splitting the data into multiple subsets (folds), training on all but one fold, validating on the held-out fold, and averaging the scores across folds (a sketch follows this list).

   - Scoring Metrics: Use metrics such as accuracy, precision, recall, F1-score, or area under the ROC curve (AUC) to quantify model performance.
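
For contrast with the random search example later in this post, here is a minimal grid search sketch on the same digits dataset. The grid values are illustrative choices, not recommendations; note how the cost grows multiplicatively with each added hyperparameter.

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)

# A small illustrative grid: 3 x 3 x 2 = 18 candidates,
# each evaluated with 5-fold cross-validation (90 fits total).
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [10, 25, None],
    'max_features': ['sqrt', 'log2']
}

grid_search = GridSearchCV(RandomForestClassifier(random_state=42),
                           param_grid, cv=5, scoring='accuracy', n_jobs=-1)
grid_search.fit(X, y)
print(grid_search.best_params_)
print(grid_search.best_score_)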
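
For Bayesian optimization, one lightweight option is the third-party Optuna library, whose default TPE sampler builds a probabilistic model of the objective. A minimal sketch, assuming Optuna is installed (pip install optuna); the search ranges mirror the random search example below:

import optuna
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)

def objective(trial):
    # Optuna proposes values; its probabilistic model steers later
    # trials toward regions that scored well in earlier ones.
    model = RandomForestClassifier(
        n_estimators=trial.suggest_int('n_estimators', 10, 200),
        max_depth=trial.suggest_int('max_depth', 5, 50),
        random_state=42)
    return cross_val_score(model, X, y, cv=5, scoring='accuracy').mean()

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)
print(study.best_params)
print(study.best_value)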
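
Since every technique above depends on an honest performance estimate, it helps to see cross-validation in isolation. Here is how 5-fold cross-validation scores one fixed configuration (the configuration itself is an arbitrary illustration):

from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)

# Train on four folds, validate on the fifth, rotate, and average.
model = RandomForestClassifier(n_estimators=100, random_state=42)
scores = cross_val_score(model, X, y, cv=5, scoring='accuracy')
print(scores.mean(), scores.std())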

#### Implementation Steps

1. Define Hyperparameters: Identify which hyperparameters need tuning for your specific model and algorithm.

2. Choose Optimization Technique: Select an appropriate technique based on computational resources and model complexity.

3. Search Space: Define the range or values for each hyperparameter to explore during optimization.

4. Evaluation: Evaluate each combination of hyperparameters using cross-validation and chosen evaluation metrics.

5. Select Best Model: Choose the model with the best performance based on the evaluation metrics.

#### Example: Hyperparameter Tuning with Random Search

Let's perform hyperparameter tuning with random search for a Random Forest classifier in scikit-learn.

from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_digits
from scipy.stats import randint

# Load the digits dataset (1,797 8x8 images of handwritten digits)
digits = load_digits()
X, y = digits.data, digits.target

# Define the model and the hyperparameter search space.
# scipy's randint(low, high) samples integers uniformly from [low, high).
model = RandomForestClassifier()
param_dist = {
    'n_estimators': randint(10, 200),
    'max_depth': randint(5, 50),
    'min_samples_split': randint(2, 20),
    'min_samples_leaf': randint(1, 20),
    'max_features': ['sqrt', 'log2', None]
}

# Randomized search: sample 100 configurations and score each
# by mean accuracy over 5-fold cross-validation.
random_search = RandomizedSearchCV(model, param_distributions=param_dist,
                                   n_iter=100, cv=5, scoring='accuracy',
                                   verbose=1, n_jobs=-1)
random_search.fit(X, y)

# Report the best configuration and its cross-validated accuracy
print("Best Hyperparameters found:")
print(random_search.best_params_)
print("Best Accuracy Score found:")
print(random_search.best_score_)

Result:


Fitting 5 folds for each of 100 candidates, totalling 500 fits
Best Hyperparameters found:
{'max_depth': 23, 'max_features': 'log2', 'min_samples_leaf': 1,
'min_samples_split': 8, 'n_estimators': 198}
Best Accuracy Score found:
0.937137109254101
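
Because RandomizedSearchCV refits the winning configuration on the full dataset by default (refit=True), the tuned model is immediately usable. A short follow-up, continuing from the code above:

# The search object already refit the best configuration on all of X, y,
# so best_estimator_ can predict directly.
best_model = random_search.best_estimator_
print(best_model.predict(X[:5]))  # predictions for the first five samples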

Hope you enjoyed learning Machine Learning!
