Computer Vision

Robot Vision Challenge

Guided by Fei-Fei Li

Your robot can see, but it can't tell a cat from a couch. Train its eyes to know the difference.

25 min, +130 XP

Watch

See it happen in the real world.

Computer vision works by turning a picture into a giant grid of numbers, then looking for patterns in that grid. A cat picture and a couch picture have different patterns of light, edges, and shapes. Fei-Fei Li led the team that built ImageNet, a dataset of more than 14 million labeled photos, so that computers could learn what 'cat' looks like from many different angles. Before ImageNet, computers were mostly guessing. After ImageNet, they had enough examples to start getting it right.
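To make the "grid of numbers" idea concrete, here is a minimal sketch in Python. The tiny 4x4 picture and the helper name `count_edges` are made up for illustration. Real photos contain millions of numbers, and real vision models look for far richer patterns than this.

```python
# A tiny made-up 4x4 grayscale "picture": 0 is black, 255 is white.
# The left half is dark and the right half is bright.
picture = [
    [10, 12, 200, 210],
    [11, 13, 205, 208],
    [9, 14, 198, 212],
    [10, 12, 201, 209],
]

def count_edges(grid, threshold=100):
    """Count places where brightness jumps sharply between side-by-side pixels."""
    edges = 0
    for row in grid:
        # Compare each pixel with its right-hand neighbor.
        for left, right in zip(row, row[1:]):
            if abs(left - right) > threshold:
                edges += 1
    return edges

print(count_edges(picture))  # one sharp jump per row, so this prints 4
```

A picture that is dark everywhere would give a count of 0, because no neighboring numbers differ by much.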

Think

A question worth sitting with.

If you showed a robot a picture of a black cat on a black couch, what would it confuse first?

Build

Make something with your hands.

Cut 20 small pictures from a magazine or print them out. Sort them into two piles: 'has a face' and 'no face'. That sorting is what a computer vision model does, except it practices on as many as 14 million pictures instead of 20.

Step-by-step

  1. Cut or print 20 pictures from magazines or the web. Mix faces, objects, animals, and scenery.

  2. Sort the stack into two piles by hand: HAS FACE and NO FACE. Take your time on the hard ones.

  3. Now pretend you're the robot. Write three rules a computer could use to decide. Example: 'Are there two dark circles for eyes?'

  4. Test your rules on five new pictures. Count how many your rules get right and how many they get wrong.

  5. Fix the rules so they get more right than before. That edit-and-retry loop is the basic idea behind training an AI model.

A toy classifier (no ML library required)

Python
# Each "image" is described by 3 simple features.
images = [
    {'name': 'cat',    'has_eyes': True,  'has_fur': True,  'is_round': True},
    {'name': 'couch',  'has_eyes': False, 'has_fur': True,  'is_round': False},
    {'name': 'apple',  'has_eyes': False, 'has_fur': False, 'is_round': True},
    {'name': 'friend', 'has_eyes': True,  'has_fur': False, 'is_round': True},
]

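def classify(image):
    # Rules are checked from top to bottom; the first one that matches wins.
    # Note: 'is_round' is recorded for each image, but no rule uses it yet.
    if image['has_eyes'] and image['has_fur']:
        return 'animal'
    if image['has_eyes']:
        return 'person'
    if image['has_fur']:
        return 'furniture'
    return 'object'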

for img in images:
    print(img['name'], '->', classify(img))
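Step 4 above asks you to count how many pictures your rules get right. A minimal sketch of that scoring step follows. The test pictures and their `expected` labels are made up for illustration, and `classify` repeats the toy rules so the snippet runs on its own.

```python
def classify(image):
    # Same rules as the toy classifier, checked from top to bottom.
    if image['has_eyes'] and image['has_fur']:
        return 'animal'
    if image['has_eyes']:
        return 'person'
    if image['has_fur']:
        return 'furniture'
    return 'object'

# Hypothetical test pictures, each with the answer a person would give.
test_images = [
    {'name': 'dog',        'has_eyes': True,  'has_fur': True,  'is_round': False, 'expected': 'animal'},
    {'name': 'lamp',       'has_eyes': False, 'has_fur': False, 'is_round': False, 'expected': 'object'},
    {'name': 'teddy bear', 'has_eyes': True,  'has_fur': True,  'is_round': True,  'expected': 'object'},
    {'name': 'fish',       'has_eyes': True,  'has_fur': False, 'is_round': False, 'expected': 'animal'},
]

right = 0
wrong = []
for img in test_images:
    guess = classify(img)
    if guess == img['expected']:
        right += 1
    else:
        # Remember each miss so the rules can be fixed later.
        wrong.append((img['name'], guess))

print(f"{right} right out of {len(test_images)}")
for name, guess in wrong:
    print(f"  missed: {name} was labeled {guess}")
```

With these made-up pictures the rules get 2 out of 4. The teddy bear and the fish are the misses, and they show which rules to fix in Step 5.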

Toolkit

  • Paper
  • Python

Play

Test it. See what it does.

Play 'I spy' with a partner using only color and shape clues. You're acting like the robot — you can't say what it is, only how it looks.

Challenge

Push it a little further.

Build a classifier that sorts your sock drawer into matched pairs by color. How few rules does it take?
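Here is one possible starting point, offered as a sketch and not as the answer. The sock list and the `pair_socks` helper are made up, and the only rule is "two socks with the same color name make a pair". Real socks would probably need more rules for pattern, length, or shades that are nearly the same.

```python
from collections import Counter

# Hypothetical sock drawer: each sock is described only by its color.
socks = ['red', 'blue', 'red', 'green', 'blue', 'blue', 'black']

def pair_socks(colors):
    """Return (pairs, leftovers) using one rule: same color means a match."""
    counts = Counter(colors)
    # Every two socks of the same color make one pair.
    pairs = {color: n // 2 for color, n in counts.items() if n >= 2}
    # An odd count leaves one sock of that color without a partner.
    leftovers = sorted(color for color, n in counts.items() if n % 2 == 1)
    return pairs, leftovers

pairs, leftovers = pair_socks(socks)
print('pairs:', pairs)
print('leftovers:', leftovers)
```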

Reflect

Notice what your robot taught you.

What did your robot learn about how you see?

Also ask yourself

What surprised you?

Reward

Mission outro


Real-time vision systems often do this about 30 times a second, once for each video frame. Yours just did it once. Try going faster.
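To see how fast your own rules run, here is a rough timing sketch. It copies the toy rules under the made-up name `classify_fast` and uses Python's `time.perf_counter`. The number it prints depends on your computer. It is also not a fair race against a real vision system, which has to look at every pixel of every picture.

```python
import time

def classify_fast(image):
    # Same rules as the toy classifier in this mission.
    if image['has_eyes'] and image['has_fur']:
        return 'animal'
    if image['has_eyes']:
        return 'person'
    if image['has_fur']:
        return 'furniture'
    return 'object'

cat = {'name': 'cat', 'has_eyes': True, 'has_fur': True, 'is_round': True}

runs = 100_000
start = time.perf_counter()
for _ in range(runs):
    label = classify_fast(cat)
elapsed = time.perf_counter() - start

print(f"{runs} classifications in {elapsed:.3f} seconds")
print(f"about {runs / elapsed:,.0f} per second")
```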

Skills advanced: color vision, object detection

Badge

Vision Coder
