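The filter weights in a CNN are learned automatically during training; the filter size and the number of filters per layer are hyperparameters you choose for the task. Start with standard sizes such as 3x3 or 5x5, which capture local spatial features effectively. Early layers tend to learn basic features like edges, and deeper layers combine them into more complex patterns. That growth usually comes from stacking small filters, which widens the receptive field with depth, rather than from making the deeper filters themselves larger; when a larger filter such as 7x7 is used, it is often in the first layer.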
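One way to compare filter sizes is by receptive field (how many input pixels a single output value can see) and by weight count. The sketch below is a minimal illustration, not a prescription: the helper names `receptive_field` and `conv_weights` are made up for this example, the channel count of 64 is an arbitrary assumption, and biases are ignored.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field, in input pixels, of a stack of conv layers."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the view by (k - 1) steps
        jump *= s             # striding makes later steps cover more pixels
    return rf


def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one conv layer (bias terms not counted)."""
    return kernel_size * kernel_size * in_channels * out_channels


channels = 64  # assumed channel count, chosen only for illustration

stacked_rf = receptive_field([3, 3])   # two 3x3 layers, stride 1
single_rf = receptive_field([5])       # one 5x5 layer

stacked_params = 2 * conv_weights(3, channels, channels)
single_params = conv_weights(5, channels, channels)

print(stacked_rf, single_rf)           # 5 5
print(stacked_params, single_params)   # 73728 102400
```

Under these assumptions, two stacked 3x3 layers see the same 5x5 input region as one 5x5 layer while using fewer weights, which is one reason 3x3 is a common default.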
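Adjust the number of filters per layer based on the input data’s complexity and the network’s depth. A common pattern is to increase the filter count with depth, for example 32, then 64, then 128, because deeper layers represent a more diverse set of features, often at lower spatial resolution. Simple datasets and shallow networks can usually manage with fewer filters.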
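To get a feel for what a given filter count costs, it helps to count trainable parameters per layer. The sketch below assumes a hypothetical three-layer network with 3x3 filters, an RGB input (3 channels), and filter counts of 32, 64, and 128; these numbers are illustrative, not a recommendation.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Trainable parameters in one conv layer: weights plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


input_channels = 3             # assumed RGB input
filter_counts = [32, 64, 128]  # hypothetical filter counts, doubling with depth

per_layer = []
in_ch = input_channels
for out_ch in filter_counts:
    per_layer.append(conv_params(3, in_ch, out_ch))
    in_ch = out_ch  # this layer's filters become the next layer's input channels

total = sum(per_layer)
print(per_layer)  # [896, 18496, 73856]
print(total)      # 93248
```

Because each layer's parameter count scales with both its input and output channels, doubling the filter count in consecutive layers roughly quadruples the weights, so the deepest layers dominate the total.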
Experiment with filter configurations during model tuning, and compare them on validation performance. Tools like Grad-CAM or feature visualization show which input regions and patterns the filters respond to, which can guide refinements to the architecture.
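The core of Grad-CAM is small enough to sketch in plain Python: average each channel's gradient to get a weight, take the weighted sum of the activation maps, and clip negatives to zero. In a real model the activations and gradients would be captured from a convolutional layer by the deep learning framework; the values below are hand-written toy numbers, and the function name `grad_cam` is an assumption made for this example.

```python
def grad_cam(activations, gradients):
    """Toy Grad-CAM over nested lists shaped [channel][row][col].

    activations: feature maps from one conv layer.
    gradients:   gradient of the class score w.r.t. those feature maps.
    Returns (per-channel weights, heatmap).
    """
    rows = len(activations[0])
    cols = len(activations[0][0])

    # Channel weight = global average of that channel's gradient.
    weights = [sum(sum(row) for row in g) / (rows * cols) for g in gradients]

    # Heatmap = ReLU of the weighted sum of activation maps.
    cam = [
        [
            max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
            for j in range(cols)
        ]
        for i in range(rows)
    ]
    return weights, cam


# Toy data: two 2x2 feature maps (made-up values, not from a trained model).
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # channel 0 supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel 1 counts against it
]

weights, cam = grad_cam(activations, gradients)
print(weights)  # [1.0, -1.0]
print(cam)      # [[1.0, 0.0], [0.0, 2.0]]
```

The heatmap keeps only the locations where positively weighted channels are active. Upsampled to the input size, it indicates which regions drove the prediction.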