Training and Evaluating Machine Learning (ML) Models

Module Overview

In this module, students will create their own machine learning models to classify fossil shark teeth using Google’s Teachable Machine (GTM). Students will upload images to GTM to produce an image-based computer vision model. These models will reflect the paleontological classification schemes (taxonomy and functional morphology) that were described in Module 2. In the next module, we will explore different techniques and tools for identifying sources of bias in these models.

Driving Question

Can we create a model to identify fossil shark teeth?

Primary Learning Objectives

  • Understand the input and output data in machine learning models
  • Recognize that data organization is determined by the research question being asked
  • Interpret the model output data (model predictions) and evaluate the classification accuracy

Materials

In-Class Lesson Guide

Activity 1: Train a Shark Tooth Classification Model

  • Review the terms artificial intelligence, machine learning, and computer vision.
  • The first model we will create is a species-based model that identifies five different types of sharks and rays. (These are the same species included in the Shark Tooth Kits.)
  • Model creation can be freeform or structured.
    • For freeform experimentation, provide students with access to images derived from the Florida Museum, Calvert Marine Museum, and the myFOSSIL eMuseum.
    • For structured experimentation, provide students with pre-selected images from these sources.
  • Go to Google’s Teachable Machine and start an image-based model.
  • Label each class as a different species: Gray Shark, Snaggletooth Shark, Sand Tiger Shark, Megatooth Shark, and Eagle Ray.
  • Upload images of each species to its respective class. (We recommend at least 30 images per class.)
    • Images are available in this Google Drive folder.
  • Before you train your model, select “Save Project to Drive” in the top left corner. (This will let you reopen your project if you want to make changes later.)
  • If you don’t want to upload the images yourself, you can load this pre-made dataset.
  • Click “Train Model” to generate your machine learning model. (Training may take a couple of minutes.)

Activity 2: Train a Shark Tooth Function Model

  • The second model we will create is a tooth function model that identifies cutting, grasping, and crushing teeth.
  • This model will use the same images as the previous model, but they will be re-organized into these functional categories.
  • Go to Google’s Teachable Machine and start an image-based model. 
  • Label each class as a different tooth function: cutting, grasping, and crushing.
  • Upload images of each tooth function to its respective class. (We recommend at least 30 images per class.)
  • Before you train your model, select “Save Project to Drive” in the top left corner. (This will let you reopen your project if you want to make changes later.)
  • If you don’t want to upload the images yourself, you can load this pre-made dataset.
  • Click “Train Model” to generate your machine learning model. (Training may take a couple of minutes.)
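
The re-organization step in this activity (same images, new categories) can be sketched in a few lines of Python. This is only an illustration: the species-to-function mapping below is an assumption and should be checked against the classification scheme from Module 2, and the image filenames and the function name regroup_by_function are hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: illustrative mapping only. Replace it with the tooth function
# assignments that were described in Module 2.
SPECIES_TO_FUNCTION = {
    "Gray Shark": "cutting",
    "Snaggletooth Shark": "cutting",
    "Megatooth Shark": "cutting",
    "Sand Tiger Shark": "grasping",
    "Eagle Ray": "crushing",
}


def regroup_by_function(images_by_species, species_to_function=SPECIES_TO_FUNCTION):
    """Regroup image filenames from species classes into tooth function classes."""
    images_by_function = defaultdict(list)
    for species, images in images_by_species.items():
        # Raises KeyError if a species has no function assigned in the mapping.
        function = species_to_function[species]
        images_by_function[function].extend(images)
    return dict(images_by_function)


# Hypothetical filenames, used only to show the regrouping.
example = {
    "Megatooth Shark": ["meg_01.jpg", "meg_02.jpg"],
    "Gray Shark": ["gray_01.jpg"],
    "Sand Tiger Shark": ["sand_01.jpg"],
    "Eagle Ray": ["ray_01.jpg"],
}

for function, images in regroup_by_function(example).items():
    print(function, images)
```

The point of the sketch is that no image changes; only the label attached to it does, which is why the same dataset can answer two different research questions.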

Activity 3: Evaluating Model Accuracy

  • We are now going to evaluate our model’s accuracy. (You can choose either of the models created in the previous activities).
  • From this page, select “Open an existing project from Drive” and open one of the models that you created in the previous activities, or load one of the pre-made models.
  • To test the model, you can use the teeth from the Shark Tooth Kits or you can use images from the internet. (Make sure these are not the same images used to train your model.)
    • If you choose to use actual teeth, you can hold them up to the computer webcam or take a photo and upload it.
  • Record the number of correct and incorrect predictions made by the model.
  • To calculate your model’s overall accuracy, divide the number of correct identifications by the total number of images tested. (For example, if you tested 10 images and the model correctly identified 7, then your model is 70% accurate.)
    • We recommend testing each class in your model with at least 5 images.
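
For classes that record their results in a spreadsheet or notebook, the accuracy calculation above can be sketched in Python. This is a minimal example under stated assumptions: the function names and the recorded results are hypothetical, and each result is written as a pair of (actual class, predicted class).

```python
from collections import Counter


def overall_accuracy(results):
    """Correct predictions divided by the total number of images tested."""
    if not results:
        raise ValueError("No test results recorded.")
    correct = sum(1 for actual, predicted in results if actual == predicted)
    return correct / len(results)


def per_class_accuracy(results):
    """Accuracy computed separately for each actual class."""
    totals = Counter(actual for actual, _ in results)
    correct = Counter(actual for actual, predicted in results if actual == predicted)
    return {label: correct[label] / totals[label] for label in totals}


def classes_needing_more_tests(results, minimum=5):
    """Classes tested with fewer images than the recommended minimum."""
    totals = Counter(actual for actual, _ in results)
    return sorted(label for label, count in totals.items() if count < minimum)


# Hypothetical record of 10 test images: (actual class, predicted class).
results = [
    ("cutting", "cutting"), ("cutting", "cutting"),
    ("cutting", "cutting"), ("cutting", "grasping"),
    ("grasping", "grasping"), ("grasping", "grasping"),
    ("grasping", "cutting"),
    ("crushing", "crushing"), ("crushing", "crushing"),
    ("crushing", "grasping"),
]

print(f"Overall accuracy: {overall_accuracy(results):.0%}")
print(per_class_accuracy(results))
print("Test more images for:", classes_needing_more_tests(results))
```

With these 10 hypothetical results, 7 predictions are correct, which matches the 70% example above. The per-class breakdown is a useful discussion prompt because a model can have a reasonable overall accuracy while still performing poorly on one class.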