I have a diverse collection of datasets, including CSV files, source code, text documents, and more, and I want to fine-tune Mixtral on them. To do that, I need guidance on how to create embeddings for each dataset type.
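
To make the question concrete, here is a rough sketch of the preprocessing I have in mind, assuming every dataset type is first normalized into plain-text records before any tokenization or embedding step. The helper names (`csv_to_records`, `chunk_text`, `to_jsonl`), the `max_chars` limit, and the JSONL layout are my own illustrative assumptions, not anything Mixtral or a specific library requires.

```python
import csv
import io
import json

# Hypothetical helpers: names and record layout are illustrative assumptions.


def csv_to_records(csv_text):
    """Turn each CSV row into one 'column: value; ...' text record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["; ".join(f"{k}: {v}" for k, v in row.items()) for row in reader]


def chunk_text(text, max_chars=1000):
    """Split prose or source code into chunks on line boundaries.

    Chunks stay within max_chars unless a single line is longer than
    that, in which case the line becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def to_jsonl(records, source_type):
    """Serialize records as JSON Lines, tagging each with its dataset type."""
    return "\n".join(
        json.dumps({"type": source_type, "text": r}) for r in records
    )


# Example: one tiny CSV and one tiny code snippet, both made up.
sample_csv = "name,age\nAda,36\nLinus,54\n"
sample_code = "def add(a, b):\n    return a + b\n"

print(to_jsonl(csv_to_records(sample_csv), "csv"))
print(to_jsonl(chunk_text(sample_code), "code"))
```

Is flattening everything to text like this a reasonable starting point, or does each dataset type need different handling before the embedding step?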