July 15th, 2020

Question 1: Describe the ImageDataGenerator() command and its associated argument. What objects and arguments do you need to specify in order to flow from the directory to the generated object? What is the significance of specifying the target_size = as it relates to your source images of varying sizes? What considerations might you reference when programming the class_mode = argument? What differences exist when applying the ImageDataGenerator() and .flow_from_directory() commands to the training and test datasets?
- ImageDataGenerator() creates a generator that, once pointed at a directory, labels each image automatically from the name of the subdirectory it sits in, which saves labeling by hand. With a rescale argument such as 1/255, the generator converts the images to float tensors and normalizes the pixel values to the range [0, 1]. To flow from the directory, the .flow_from_directory() command is called on the generator with the source directory, the target_size, the batch_size, and the class_mode specified; the images are then loaded and passed to the model in batches. The target_size argument resizes every image to the same dimensions, which creates uniformity across source images of varying sizes. I do have a concern about this method: resizing an image that does not have the same aspect ratio as the target could distort the image and thus hurt accuracy. Lastly, class_mode sets the label type as either binary or categorical. If there are two classes, we need binary labels. If there are more than two classes, categorical labels are necessary. These commands can be applied to both the training and test datasets, but the respective directory and batch size must be indicated for each.
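The steps described above can be sketched roughly as follows. This is only an illustration, not the code from the assignment: the helper names pick_class_mode and build_generator, the 300x300 target size, and the batch size of 32 are all assumptions, and the directory path is whatever the caller supplies. TensorFlow is imported inside the function so that the label helper can be used on its own.

```python
def pick_class_mode(num_classes):
    """Return the class_mode string that matches the number of class subdirectories."""
    if num_classes < 2:
        raise ValueError("need at least two classes")
    return "binary" if num_classes == 2 else "categorical"


def build_generator(directory, num_classes, target_size=(300, 300), batch_size=32):
    """Create a batch generator over `directory` (one subdirectory per class).

    target_size and batch_size are illustrative defaults, not values from the assignment.
    """
    # Imported lazily so pick_class_mode works without TensorFlow installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # rescale=1/255 maps 8-bit pixel values into the range [0, 1].
    datagen = ImageDataGenerator(rescale=1.0 / 255)

    # Labels are inferred from the subdirectory names; every image is resized
    # to target_size regardless of its original aspect ratio.
    return datagen.flow_from_directory(
        directory,
        target_size=target_size,
        batch_size=batch_size,
        class_mode=pick_class_mode(num_classes),
    )
```

The same function would be called twice, once with the training directory and once with the test (validation) directory, each with its own batch size.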
Question 2: Describe the model architecture of the horses and humans CNN as you have specified it. Did you modify the number of filters in your Conv2D layers? How do image sizes decrease as they are passed from each of your Conv2D layers to your MaxPooling2D layer and on to the next iteration? Finally, which activation function have you selected for your output layer? What is the significance of this argument’s function within the context of your CNN’s prediction of whether an image is a horse or a human? What functions have you used in the arguments of your model compiler?

- When examining the CNN, we see that there are three layers of convolution. The first layer has 16 filters, the second 32, and the third 64. This hierarchy is often seen in neural networks, as the first layer serves to pick out the main features, and the next two filter the images in more depth. Along with the layers of convolution, there are three MaxPooling layers. As the image passes through these layers, it steadily decreases in size. As seen in the summary, each MaxPooling2D layer cuts the image to half its size, and the next Conv2D layer then reduces it by two more pixels. This pattern continues until the last layer is processed. After the convolutions and pooling, the image is flattened and passed through two dense layers, one with ‘relu’ activation and the other with ‘sigmoid’ activation. Because this is a binary classification model, the network ends with sigmoid activation. Thus, the output will be between 0 and 1. This output is then used to determine whether the image is classified as a horse or a human. The model, however, failed when tested. After uploading two images, one of myself and one of my sister, unfortunately we were both classified as horses. Thus, the model must be either underfit or overfit. In order to determine which, we can compare the training accuracy with the validation accuracy. The last epoch shows that the training accuracy reached 100% while the validation accuracy is only 82%. Thus, we can conclude that the model is quite overfit. In order to increase the validation score, we can reduce the number of convolutional layers or the number of filters being used. We would also check that the optimizer, the rescaling, and the number of epochs are set appropriately, and continue from there.
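The shrinking pattern described above can be worked through with a little arithmetic. The sketch below assumes a 300x300 input, 3x3 convolution kernels without padding, and 2x2 max pooling; none of these values are stated in the answer, so treat them as assumptions. The function name trace_feature_map_sizes is hypothetical.

```python
def trace_feature_map_sizes(input_size, num_blocks=3, kernel_size=3, pool_size=2):
    """List the side length of a square image after each Conv2D and MaxPooling2D layer.

    Assumes unpadded convolutions with stride 1 and non-overlapping pooling.
    """
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        # An unpadded k x k convolution trims (k - 1) pixels from each dimension.
        size = size - kernel_size + 1
        sizes.append(size)
        # Pooling divides each dimension by the pool size, rounding down.
        size = size // pool_size
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    print(trace_feature_map_sizes(300))
```

Under these assumptions the side length goes 298, 149, 147, 73, 71, 35, which matches the "halve, then lose two pixels" pattern in the model summary.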

Question 3: Using the auto-mpg dataset (auto-mpg.data), upload the image where you used the seaborn library to pairwise plot the four variables specified in your model. Describe how you could use this plot to investigate the co-relationship amongst each of your variables. Are you able to identify interactions amongst variables with this plot? What does the diagonal axis represent? Explain what this function is describing with regard to each of the variables.

- Pairwise plots compare every pair of variables in the data against each other, thus showing the relationship between them. From the plots, we can differentiate between positive correlations, negative correlations, and no correlation, and gauge the relative weight of each variable. Plotting the variables can help us determine how to construct our model. The diagonal provides a univariate view of the data: each graph on the diagonal is the probability distribution of a single variable. As seen, the distributions on the diagonal tend to be skewed right.
- Inferences made from the pairwise plot:
  - Displacement and Weight have a logistic relationship
  - Displacement & MPG and MPG & Weight have an exponential relationship
  - Displacement and Cylinders may have a step/piecewise relationship
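A plot of this kind could be produced along the following lines. This is a sketch rather than the code used for the assignment: the column names are assumed spellings of the four variables discussed above, make_pairplot is a hypothetical helper, and the DataFrame is assumed to have been loaded from auto-mpg.data already.

```python
# Assumed column names for the four variables discussed in the answer.
PAIRPLOT_COLUMNS = ["MPG", "Cylinders", "Displacement", "Weight"]


def make_pairplot(dataframe, columns=PAIRPLOT_COLUMNS):
    """Draw a pairwise plot of `columns` with density estimates on the diagonal."""
    # Imported lazily so this module loads even where seaborn is not installed.
    import seaborn as sns

    # Off-diagonal panels are scatter plots of each pair of variables;
    # diag_kind="kde" puts each variable's estimated distribution on the diagonal.
    return sns.pairplot(dataframe[list(columns)], diag_kind="kde")
```

Setting diag_kind="kde" is what makes the diagonal show a smoothed distribution for each variable instead of a histogram.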

Question 4: After running model.fit() on the auto-mpg.data data object, you returned the hist.tail() from the dataset where the training loss, MAE & MSE were recorded as well as those same variables for the validating dataset. What interpretation can you offer when considering these last 5 observations from the model output? Does the model continue to improve even during each of these last 5 steps? Can you include a plot to illustrate your answer? Stretch goal: include and describe the final plot that illustrates the trend of true values to predicted values as overlaid upon the histogram of prediction error.
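- When considering the last five observations, the validation error is increasing rather than decreasing, so the model is becoming less accurate on data it has not seen, which signifies that the model is overfit. Both the MAE and MSE plots show that after about 100 epochs, the validation error begins to increase, so the model does not continue to improve during these last five steps.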
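One way to check this numerically is to find the epoch with the lowest validation error and ask whether it falls inside the last five epochs. The sketch below is only an illustration: the function names are hypothetical and the error values in the example are made up, not the results from this assignment.

```python
def best_epoch(val_errors):
    """Return the index of the epoch with the lowest validation error."""
    return min(range(len(val_errors)), key=lambda i: val_errors[i])


def improved_in_last(val_errors, window=5):
    """Return True if the lowest validation error occurs within the final `window` epochs."""
    return best_epoch(val_errors) >= len(val_errors) - window


if __name__ == "__main__":
    # Made-up validation MAE values: the error bottoms out early, then rises.
    example = [3.0, 2.5, 2.2, 2.1, 2.3, 2.4, 2.6, 2.7, 2.9]
    print(best_epoch(example), improved_in_last(example))
```

If the best epoch lies well before the final window, as in the made-up example, training past that point is only fitting the training set more closely, and stopping earlier would be reasonable.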
Question 5: What was the significance of comparing the 4 different sized models (tiny, small, medium, large)? Can you include a plot to illustrate your answer?
- The comparison graph allows one to pick the optimal model quickly. From the graph we can infer which model reached its peak, which needed more fitting, and which was overfit at any chosen epoch. For example, it can be seen that the large model became overfit very quickly in comparison to the others. This graph also allows us to note something that would have been more difficult to gauge without it: trends. With the comparison graph, we can note that for this set of data, the larger the model, the more quickly it overfits. Such trends can help us know how to improve the model. For example, suppose that I had started with the medium model. Although it would be safe to guess that a larger model would overfit the data even more, the comparison graph can also confirm whether a smaller model would underfit or overfit the data in this case.