FlavioBF committed
Commit 148a5e5 · verified · 1 Parent(s): acb9a74

Update app.py

Files changed (1)
  1. app.py +7 -7
app.py CHANGED
@@ -34,17 +34,17 @@ client = OpenAI(api_key=api_key)
 
 title = "QUESTION ANSWERING WITH RAG TECHNIQUES"
 description = """
-This Space uses:
+This Space uses:\n
 FlavioBF/multi-qa-mpnet-base-dot-v1_fine_tuned_model
 ChatGPT 3.5-turbo
-to get insight from the Enron Mail dataset. In particular:\n
-The question asnswer model multi-qa-mpnet-base-dot-v1 has been been used to create embeddings of a small subset of the dataset and, then, trained over a sampled dataset of 500k istances of the original Enron dataset
-Embedded content is used to retrieve context that will be used by the downsrteam processing using similarity analysis (metric = dot product). \n
+to get insights from the Enron Mail dataset. In particular:\n
+The question-answering model multi-qa-mpnet-base-dot-v1 has been used to create embeddings of a small subset of the dataset and was then trained on a sample of approx. 500k instances of the original Enron dataset.
+Embedded content is used to retrieve context for the downstream processing via similarity analysis (metric = dot product). \n
 The chunk size of 500 chars used in the text splitters is probably too small to capture sentences effectively; nevertheless it has been kept. \n
-Further tests would have been required with 1000k and 1.5k chars.
-To answer to quesitons from Enron dataset, both mnodels are using the context genrated by RAG technique. \n
 
-REMARK: due to the limited storage capacity the context can be generated only over a limited number of mails. The GPT 3.5 turbo model has been instructed to avoi to make up an answer base on its own trauined data
+To answer questions about the Enron dataset, both models use the context generated by the RAG technique. \n
+
+REMARK: due to the limited storage capacity, the context can be generated only over a limited number of mails. The GPT-3.5-turbo model has been instructed to avoid making up answers when the context is not clear.
 """
 
 examples=[
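
For readers unfamiliar with the flow the description refers to, the following is a minimal sketch of this kind of RAG loop: embed the mail chunks with the fine-tuned multi-qa-mpnet-base-dot-v1 model, retrieve the best-matching chunks by dot-product score, and pass them as context to GPT-3.5-turbo with an instruction not to make up answers. This is not the actual app.py code; the chunks list, the top_k value, and the prompt wording are assumptions for illustration.

# Minimal sketch of the RAG flow described above -- NOT the actual app.py code.
# Assumptions: `chunks` (a list of ~500-char mail excerpts) and `api_key` are
# defined elsewhere; prompt wording and top_k are illustrative only.
import torch
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("FlavioBF/multi-qa-mpnet-base-dot-v1_fine_tuned_model")
client = OpenAI(api_key=api_key)

# Pre-compute embeddings for the (limited) set of mail chunks.
chunk_embeddings = embedder.encode(chunks, convert_to_tensor=True)

def answer(question: str, top_k: int = 3) -> str:
    # Retrieve the most similar chunks using the dot-product metric.
    query_embedding = embedder.encode(question, convert_to_tensor=True)
    scores = util.dot_score(query_embedding, chunk_embeddings)[0]
    best = torch.topk(scores, k=min(top_k, len(chunks))).indices
    context = "\n".join(chunks[int(i)] for i in best)

    # Ask GPT-3.5-turbo, instructing it to stick to the retrieved context.
    messages = [
        {"role": "system",
         "content": "Answer only from the provided context. "
                    "If the context is not clear, say you don't know."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
    response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
    return response.choices[0].message.content

The dot-product metric used in the sketch mirrors the description's choice (metric = dot product), which is the similarity the multi-qa-mpnet-base-dot-v1 base model is intended for.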