Jukebox is a music-making tool by OpenAI. While the achievement is significant from a technological perspective, the results are unlikely to threaten the livelihoods of human musicians.
DALL·E (whose name honours both Salvador Dalí and Pixar’s WALL·E) generates images in response to written inputs and is a decoder-only transformer model. From Andrew Ng's 'The Batch' newsletter: OpenAI trained it on images with text captions taken from the internet. Given a sequence of tokens that represent a text and/or image, it predicts the next token. Then it predicts the following token given its previous prediction and all earlier tokens, and repeats this until the image is complete. This allows DALL·E to generate images from a wide range of text prompts and to generate fanciful images that aren’t represented in its training data, such as “an armchair in the shape of an avocado.” Why does it matter? As Ilya Sutskever puts it, ‘combining language and vision techniques could overcome computer vision’s need for large, well labeled datasets’.
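
The next-token loop described above can be sketched in a few lines of Python. This is only an illustration of autoregressive generation, not DALL·E's actual code: the vocabulary size, the token values and the function `toy_next_token_probs` (a random stand-in for a trained transformer) are all assumptions made up for the example.

```python
import math
import random

VOCAB_SIZE = 16  # hypothetical tiny vocabulary shared by text and image tokens


def toy_next_token_probs(tokens):
    """Stand-in for a trained decoder-only transformer (NOT DALL·E itself).

    Returns a probability distribution over the next token that depends on
    all tokens seen so far, which is the only property this sketch needs.
    """
    # Derive a repeatable seed from the whole sequence, so the "prediction"
    # changes whenever any earlier token changes.
    seed = sum((i + 1) * t for i, t in enumerate(tokens))
    rng = random.Random(seed)
    logits = [rng.uniform(-1.0, 1.0) for _ in range(VOCAB_SIZE)]
    # Softmax turns the raw scores into probabilities that sum to 1.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt_tokens, num_new_tokens, seed=0):
    """Extend prompt_tokens one token at a time (autoregressive sampling)."""
    rng = random.Random(seed)
    tokens = list(prompt_tokens)
    for _ in range(num_new_tokens):
        # Predict from the prompt plus every token generated so far.
        probs = toy_next_token_probs(tokens)
        next_token = rng.choices(range(VOCAB_SIZE), weights=probs, k=1)[0]
        # The new token becomes part of the input for the next step.
        tokens.append(next_token)
    return tokens


caption_tokens = [3, 7, 1]  # hypothetical token ids standing in for a caption
sequence = generate(caption_tokens, num_new_tokens=8)
image_tokens = sequence[len(caption_tokens):]  # tokens produced after the caption
print(image_tokens)
```

In a real system the caption would be tokenized text, the generated tokens would be decoded back into pixels, and the probabilities would come from the trained model rather than a seeded random number generator.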