The Hidden Auditory Knowledge Inside Language Models

Text-only LLMs may already know enough about sound to predict downstream audio model performance before an encoder is ever attached.
