Feature scaling is the process of normalizing or standardizing the input data to ensure that features with different scales do not dominate or skew the training process. Neural networks often perform better when the input features are scaled to a similar range, typically between 0 and 1, or standardized to have zero mean and unit variance.
Scaling keeps features on comparable magnitudes so that no single feature dominates the gradient updates, allowing the optimizer to converge more efficiently. Common techniques include Min-Max scaling, which maps each feature to a fixed range via (x - min) / (max - min), and Z-score standardization, which subtracts each feature's mean and divides by its standard deviation.
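The following is a minimal sketch of both transforms in plain NumPy; libraries such as scikit-learn provide equivalent MinMaxScaler and StandardScaler utilities, and the toy feature values here are purely illustrative.

```python
import numpy as np

# Toy feature matrix: rows are samples, columns are features on very
# different scales (e.g., age in years vs. income in dollars).
X = np.array([
    [25.0,  40_000.0],
    [32.0,  85_000.0],
    [47.0, 120_000.0],
    [51.0,  62_000.0],
])

# Min-Max scaling: map each feature to the [0, 1] range.
x_min = X.min(axis=0)
x_max = X.max(axis=0)
X_minmax = (X - x_min) / (x_max - x_min)

# Z-score standardization: zero mean, unit variance per feature.
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_standardized = (X - mu) / sigma

print(X_minmax)
print(X_standardized)
```

In practice the scaling statistics (min/max or mean/standard deviation) are computed on the training set only and then reused to transform validation and test data, so no information leaks from held-out samples into training.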
Without proper scaling, training may become slow or unstable, especially in deep networks, because features with much larger values than others dominate the gradient updates and force the optimizer to use a smaller learning rate than the remaining features would need.
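A small sketch of that effect, using a hypothetical two-feature linear regression problem: with unscaled inputs, the gradient component for the large-valued feature is orders of magnitude bigger than the other, while after standardization the two components are comparable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two features on very different scales (illustrative values).
age = rng.uniform(20, 60, size=(200, 1))               # tens
income = rng.uniform(30_000, 150_000, size=(200, 1))   # tens of thousands
X = np.hstack([age, income])
y = 0.5 * age.ravel() + 0.0001 * income.ravel() + rng.normal(0, 1, 200)

def gradient(X, y, w):
    # Gradient of mean squared error for a linear model y_hat = X @ w.
    return 2 * X.T @ (X @ w - y) / len(y)

w0 = np.zeros(2)
print("unscaled gradient magnitudes:", np.abs(gradient(X, y, w0)))

# After standardization the per-feature gradients have similar magnitudes,
# so a single learning rate works for both weights.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print("scaled gradient magnitudes:  ", np.abs(gradient(X_std, y, w0)))
```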