
# Model Parameter Reference

This guide provides comprehensive parameter references for all supported models in GlassAlpha.

!!! info "Quick Links"

    - Choosing a model? → Start with the Model Selection Guide
    - Need configuration help? → See the Configuration Guide
    - Getting started? → Try the Quick Start Guide first

## Understanding Parameter Validation

GlassAlpha passes model parameters directly to the underlying ML libraries (scikit-learn, XGBoost, LightGBM). These libraries handle unknown parameters differently (see the sketch after this list):

- **scikit-learn**: Ignores unknown parameters (no error, may log a warning)
- **XGBoost**: Warns about unknown parameters in logs
- **LightGBM**: Warns about unknown parameters in logs
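For example, XGBoost's warning behavior is easy to observe directly. A minimal sketch using synthetic data (`n_trees` is a deliberate typo, not a real parameter):

```python
import numpy as np
from xgboost import XGBClassifier

X = np.random.rand(50, 3)
y = np.random.randint(0, 2, 50)

# The sklearn wrapper accepts arbitrary keyword arguments; unknown ones
# like 'n_trees' are only flagged with a warning when training starts.
model = XGBClassifier(n_estimators=10, n_trees=100)
model.fit(X, y)  # watch stderr/logs for the unused-parameter warning
```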

### Check Your Logs

If you see warnings in your audit output about unknown parameters, double-check your parameter names against this reference.


## LogisticRegression

Logistic Regression is the baseline model for binary and multiclass classification. It's interpretable, fast, and works well for linearly separable data.

### Common Parameters

| Parameter | Type | Default | Valid Range | Description |
|-----------|------|---------|-------------|-------------|
| `random_state` | int | None | 0-2³¹ | Random seed for reproducibility |
| `max_iter` | int | 100 | 1-10000 | Maximum iterations for the solver |
| `C` | float | 1.0 | 0.001-100.0 | Inverse regularization strength (smaller = more regularization) |
| `penalty` | str | 'l2' | 'l1', 'l2', 'elasticnet', 'none' | Regularization type |
| `solver` | str | 'lbfgs' | 'lbfgs', 'liblinear', 'saga', 'newton-cg' | Optimization algorithm |
| `tol` | float | 0.0001 | 1e-6 to 0.01 | Tolerance for the stopping criteria |
| `fit_intercept` | bool | True | True, False | Whether to calculate the intercept |
| `class_weight` | str/dict | None | 'balanced', dict | Weights for imbalanced classes |
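For intuition about `C`, here's a minimal scikit-learn sketch (synthetic data, not a GlassAlpha API): smaller values apply a stronger penalty and shrink the coefficients harder.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Smaller C = stronger L2 penalty = smaller coefficients.
for C in (0.01, 1.0, 100.0):
    clf = LogisticRegression(C=C, max_iter=1000, random_state=42).fit(X, y)
    print(f"C={C}: sum of |coefficients| = {abs(clf.coef_).sum():.2f}")
```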

### Usage Example

```yaml
model:
  type: logistic_regression
  params:
    random_state: 42
    max_iter: 1000
    C: 1.0
    penalty: "l2"
    solver: "lbfgs"
```

### Common Mistakes

**Wrong:**

```yaml
model:
  type: logistic_regression
  params:
    max_iter: -1 # ❌ Negative value causes error
    regul: 1.0 # ❌ Typo - should be 'C'
    iterations: 1000 # ❌ Wrong name - should be 'max_iter'
    C: 0 # ❌ Zero is invalid (must be > 0)
```

**Correct:**

```yaml
model:
  type: logistic_regression
  params:
    random_state: 42
    max_iter: 1000
    C: 1.0
    penalty: "l2"
```

### Solver Compatibility

| Solver | l1 | l2 | elasticnet | none | Multiclass |
|-----------|----|----|------------|------|------------|
| lbfgs | ❌ | ✅ | ❌ | ✅ | ✅ |
| liblinear | ✅ | ✅ | ❌ | ❌ | ❌ (binary only) |
| saga | ✅ | ✅ | ✅ | ✅ | ✅ |
| newton-cg | ❌ | ✅ | ❌ | ✅ | ✅ |
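Incompatible combinations fail at fit time in scikit-learn. A small sketch of what that looks like (synthetic data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(50, 3)
y = np.random.randint(0, 2, 50)

# lbfgs does not support the l1 penalty, so this raises a ValueError.
try:
    LogisticRegression(penalty="l1", solver="lbfgs").fit(X, y)
except ValueError as err:
    print(err)

# saga supports l1, so this combination trains fine.
LogisticRegression(penalty="l1", solver="saga", max_iter=2000).fit(X, y)
```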

### Full Reference

See the scikit-learn LogisticRegression documentation.


## XGBoost

XGBoost is a powerful gradient boosting library that works well for structured/tabular data.

### Common Parameters

| Parameter | Type | Default | Valid Range | Description |
|-----------|------|---------|-------------|-------------|
| `random_state` | int | 0 | 0-2³¹ | Random seed |
| `n_estimators` | int | 100 | 10-1000 | Number of boosting trees |
| `max_depth` | int | 6 | 2-15 | Maximum tree depth |
| `learning_rate` | float | 0.3 | 0.01-0.3 | Step size shrinkage (eta) |
| `subsample` | float | 1.0 | 0.5-1.0 | Fraction of samples for each tree |
| `colsample_bytree` | float | 1.0 | 0.5-1.0 | Fraction of features for each tree |
| `min_child_weight` | int | 1 | 1-10 | Minimum sum of instance weight in a child |
| `gamma` | float | 0 | 0-10 | Minimum loss reduction for a split |
| `reg_alpha` | float | 0 | 0-10 | L1 regularization |
| `reg_lambda` | float | 1 | 0-10 | L2 regularization |

### Usage Example

```yaml
model:
  type: xgboost
  params:
    random_state: 42
    n_estimators: 100
    max_depth: 6
    learning_rate: 0.1
    subsample: 0.8
    colsample_bytree: 0.8
```

### Performance Tuning

**Fast training (less accuracy):**

```yaml
params:
  n_estimators: 50
  max_depth: 3
  learning_rate: 0.3
```

**Better accuracy (slower):**

```yaml
params:
  n_estimators: 500
  max_depth: 8
  learning_rate: 0.01
  subsample: 0.8
  colsample_bytree: 0.8
```
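The two profiles above trade off the same knobs: a lower `learning_rate` generally needs more trees to reach the same quality. A rough comparison sketch (synthetic data; exact scores will vary):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# Few fast trees vs. many slow trees: same knobs, opposite settings.
for n_estimators, learning_rate in [(50, 0.3), (500, 0.01)]:
    model = XGBClassifier(n_estimators=n_estimators,
                          learning_rate=learning_rate,
                          max_depth=3, random_state=42).fit(X_tr, y_tr)
    accuracy = accuracy_score(y_te, model.predict(X_te))
    print(n_estimators, learning_rate, round(accuracy, 3))
```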

**Prevent overfitting:**

```yaml
params:
  max_depth: 3 # Shallower trees
  min_child_weight: 5 # Require more samples per leaf
  gamma: 1 # Higher split threshold
  subsample: 0.7 # Use less data per tree
  reg_alpha: 1 # L1 regularization
  reg_lambda: 1 # L2 regularization
```

### Common Mistakes

**Wrong:**

```yaml
model:
  type: xgboost
  params:
    n_trees: 100 # ❌ Wrong name - should be 'n_estimators'
    depth: 6 # ❌ Wrong name - should be 'max_depth'
    learning_rate: 5.0 # ❌ Too high (typically 0.01-0.3)
    max_depth: -1 # ❌ Negative depth invalid for XGBoost
```

**Correct:**

```yaml
model:
  type: xgboost
  params:
    n_estimators: 100
    max_depth: 6
    learning_rate: 0.1
```

### Full Reference

See the XGBoost parameter documentation.


## LightGBM

LightGBM is a fast gradient boosting framework that uses tree-based learning algorithms.

### Common Parameters

| Parameter | Type | Default | Valid Range | Description |
|-----------|------|---------|-------------|-------------|
| `random_state` | int | None | 0-2³¹ | Random seed |
| `n_estimators` | int | 100 | 10-1000 | Number of boosting trees |
| `max_depth` | int | -1 | -1, 2-15 | Maximum tree depth (-1 = no limit) |
| `learning_rate` | float | 0.1 | 0.01-0.3 | Boosting learning rate |
| `num_leaves` | int | 31 | 2-131072 | Maximum leaves in one tree |
| `subsample` | float | 1.0 | 0.5-1.0 | Fraction of data for training |
| `colsample_bytree` | float | 1.0 | 0.5-1.0 | Fraction of features for each tree |
| `min_child_samples` | int | 20 | 1-100 | Minimum samples in one leaf |
| `reg_alpha` | float | 0.0 | 0-10 | L1 regularization |
| `reg_lambda` | float | 0.0 | 0-10 | L2 regularization |

### Usage Example

```yaml
model:
  type: lightgbm
  params:
    random_state: 42
    n_estimators: 100
    num_leaves: 31
    learning_rate: 0.1
    subsample: 0.8
    colsample_bytree: 0.8
```

### Performance Tuning

**Fast training:**

```yaml
params:
  n_estimators: 50
  num_leaves: 15
  learning_rate: 0.1
```

**Better accuracy:**

```yaml
params:
  n_estimators: 500
  num_leaves: 63
  learning_rate: 0.01
  subsample: 0.8
  min_child_samples: 10
```

**Prevent overfitting:**

```yaml
params:
  num_leaves: 15 # Fewer leaves
  min_child_samples: 50 # More samples per leaf
  subsample: 0.7 # Use less data
  reg_alpha: 1 # L1 regularization
  reg_lambda: 1 # L2 regularization
```
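To check how much a single guard such as `min_child_samples` actually helps, a quick cross-validation sweep works. A sketch with synthetic data (tune the values on your own dataset):

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Larger min_child_samples forbids leaves that fit only a few rows.
for min_child_samples in (5, 20, 50):
    model = LGBMClassifier(n_estimators=100, num_leaves=31,
                           min_child_samples=min_child_samples,
                           random_state=42, verbose=-1)
    score = cross_val_score(model, X, y, cv=3).mean()
    print(min_child_samples, round(score, 3))
```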

### Important Notes

#### `max_depth = -1`

In LightGBM, `max_depth: -1` means no limit on tree depth. This differs from XGBoost, where negative values are invalid.

#### `num_leaves` vs `max_depth`

LightGBM grows trees leaf-wise (best-first) rather than level-wise (depth-first), so you typically control tree complexity via `num_leaves` rather than `max_depth`.
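A tree of depth d can hold at most 2^d leaves, so if you set both parameters, keep `num_leaves` below 2^`max_depth`; otherwise the depth cap silently dominates. A sketch of a consistent pairing:

```python
from lightgbm import LGBMClassifier

# num_leaves is the primary complexity control; max_depth acts as an
# optional backstop. 31 < 2**7 = 128, so the two settings don't conflict.
model = LGBMClassifier(
    n_estimators=100,
    num_leaves=31,
    max_depth=7,
    random_state=42,
)
```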

### Common Mistakes

**Wrong:**

```yaml
model:
  type: lightgbm
  params:
    n_trees: 100 # ❌ Wrong name - should be 'n_estimators'
    max_leaves: 31 # ❌ Wrong name - should be 'num_leaves'
    learning_rate: -0.1 # ❌ Negative learning rate
```

**Correct:**

```yaml
model:
  type: lightgbm
  params:
    n_estimators: 100
    num_leaves: 31
    learning_rate: 0.1
```

### Full Reference

See the LightGBM parameter documentation.


## Parameter Validation Tips

### Check Your Config

Use `glassalpha validate` to check your configuration:

```bash
glassalpha validate --config my_config.yaml
```

### Enable Verbose Logging

See parameter warnings in your logs:

```yaml
logging:
  level: INFO # Or DEBUG for more detail
```

### Common Parameter Mistakes

1. **Typos in parameter names** - Double-check spelling (a quick programmatic check is sketched below)
2. **Using the wrong library's parameter names** - Each library has its own conventions
3. **Invalid value ranges** - Check the valid range column in the tables above
4. **Negative values where not allowed** - Most counts/sizes must be positive
5. **Wrong types** - Use int for counts, float for rates
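All three libraries expose scikit-learn-style estimators, so `get_params()` lists the exact names each model accepts; anything not in that list is a typo or another library's convention. A sketch:

```python
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

# Print every parameter name each estimator recognizes.
print(sorted(LogisticRegression().get_params()))
print(sorted(XGBClassifier().get_params()))
```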

### Getting Help

If you're unsure about a parameter:

1. Check this reference guide
2. Run `glassalpha validate --config your_config.yaml`
3. Check the official library documentation
4. See our FAQ for common issues

## See Also