---
title: USDA Food Assistant
emoji: 🍴
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---

# USDA Food Assistant

The USDA Food Assistant is an interactive tool designed to help users explore detailed food data from the USDA Branded Food Dataset. By combining semantic search with natural language processing, the assistant enables users to retrieve food-specific information and engage in a conversational exploration of nutrients, ingredients, and serving sizes.

## Overview

The USDA Food Assistant operates in two main steps:

1. **Data Retrieval**: Users begin by inputting the name of a food item (e.g., “Oreo cookies”), which initiates a semantic search in the Pinecone Vector Store. Using the `multilingual-e5-large` embedding model, the assistant retrieves relevant data, such as ingredients, nutrients, and serving sizes for the specified food item, and loads this information as context for the interaction.

2. **Interactive Conversation**: Once the data is loaded into context, users can ask detailed follow-up questions about the food item. Questions might include:
   - “What are the vitamins and minerals in this item?”
   - “How many calories are in a 250-gram serving?”
   - “Does this food contain any allergens?”

Through this structured flow, users gain a comprehensive view of each food item's nutritional profile, making it a valuable tool for informed decision-making regarding food content and nutrition.

For more information on the development and structure of this assistant, see the blog post [here](https://jacktol.net/posts/building_a_data_pipeline_for_usda_fooddata_central/).

## Access the Dataset

The USDA Branded Food Dataset used by this assistant is available on HuggingFace Datasets [here](https://huggingface.co/datasets/jacktol/usda_branded_food_data).

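
A serving-size question such as “How many calories are in a 250-gram serving?” comes down to simple scaling once a food record is available. The sketch below illustrates that arithmetic only. It assumes nutrient amounts are expressed per 100 g; the record, its field names, and its values are hypothetical and are not taken from the dataset's actual schema.

```python
def scale_nutrient(amount_per_100g: float, serving_grams: float) -> float:
    """Scale a per-100 g nutrient amount to an arbitrary serving size."""
    if serving_grams < 0:
        raise ValueError("serving_grams must be non-negative")
    return amount_per_100g * serving_grams / 100.0


# Hypothetical record: the field names and values are illustrative only
# and do not reflect the real dataset schema.
example_food = {"description": "Example cookie", "energy_kcal_per_100g": 480.0}

calories = scale_nutrient(example_food["energy_kcal_per_100g"], 250)
print(f"{calories:.0f} kcal in a 250 g serving")
```

If a record stores amounts per labeled serving instead, the same idea applies with the serving weight in place of 100.
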
## See the Code

The full code for the data pipeline responsible for creating the dataset, as well as for the USDA Food Assistant, can be found on GitHub [here](https://github.com/jack-tol/usda-food-data-pipeline).

## License

This project is licensed under the MIT License.