Detect anime faces and landmarks in an image
Generate speech from text in Japanese or English
Towards Unified Music Emotion Recognition across Dimensional