Since the perceptron algorithm converges regardless of the initialization (on linearly separable training data), the final performance on the training set must be the same. Are the resulting theta's the same regardless of the initialization? (A small numerical check appears after the questions below.)
A. Yes
B. No
Does this necessarily imply that the performance on a test set is the same?
A. Yes
B. No
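A minimal sketch for checking these questions empirically, assuming a toy 2D linearly separable dataset (the data and the `perceptron` helper below are illustrative, not part of the original exercise): it runs the perceptron updates from two different initializations on the same training set and prints the resulting theta's and training errors, so you can compare them directly.

```python
import numpy as np

def perceptron(X, y, theta_init, theta0_init, max_epochs=100):
    """Run perceptron updates until an epoch with no mistakes on (X, y)."""
    theta, theta0 = theta_init.astype(float), float(theta0_init)
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            if y_i * (theta @ x_i + theta0) <= 0:  # misclassified (or on the boundary)
                theta += y_i * x_i
                theta0 += y_i
                mistakes += 1
        if mistakes == 0:  # converged: zero errors on the training set
            break
    return theta, theta0

# Toy linearly separable training set (illustrative)
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

# Same data, two different initializations
theta_a, theta0_a = perceptron(X, y, np.zeros(2), 0.0)
theta_b, theta0_b = perceptron(X, y, np.array([5.0, -3.0]), 2.0)

train_err = lambda th, th0: np.mean(np.sign(X @ th + th0) != y)
print("init A:", theta_a, theta0_a, "training error:", train_err(theta_a, theta0_a))
print("init B:", theta_b, theta0_b, "training error:", train_err(theta_b, theta0_b))
```

Both runs stop only once an epoch produces zero mistakes, so the training error at convergence is the same by construction; comparing the printed theta's, and how the two resulting decision boundaries would label points outside the training set, is left as the check for the two questions above.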