"tokens for types of scenes, tokens for images, sounds), can be grouped as" . . . .