I've published an article showing five ways to use 🪢 Langfuse with 🤗 Hugging Face.
My personal favorite is Method #4: Using Hugging Face Datasets for Langfuse Dataset Experiments. This lets you benchmark your LLM app or AI agent with a dataset hosted on Hugging Face. In this example, I chose the GSM8K dataset (openai/gsm8k) to test the mathematical reasoning capabilities of my smolagent :)