m-ric (HF staff) committed on
Commit d9f26aa • 1 Parent(s): a7f608a

Update app.py

Files changed (1)
  1. app.py +18 -4
app.py CHANGED
@@ -62,6 +62,15 @@ def chunk(text, length, splitter_selection, separators_str, length_unit_selectio
     output = [((split[0], 'Overlap') if split[1] else (split[0], f"Chunk {str(i)}")) for i, split in enumerate(unoverlapped_text_splits)]
     return output
 
+def change_preset_separators(choice):
+    text_splitter = RecursiveCharacterTextSplitter()
+    if choice == "Default":
+        return ["\n\n", "\n", " ", ""]
+    elif choice == "Markdown":
+        return text_splitter.get_separators_for_language(Language.MARKDOWN)
+    elif choice == "Python":
+        return text_splitter.get_separators_for_language(Language.PYTHON)
+
 
 EXAMPLE_TEXT = """### Chapter 6
 
@@ -70,9 +79,9 @@ WHAT SORT OF DESPOTISM DEMOCRATIC NATIONS HAVE TO FEAR
 I had remarked during my stay in the United States that a democratic state of society, similar to that of the Americans, might offer singular facilities for the establishment of despotism; and I perceived, upon my return to Europe, how much use had already been made, by most of our rulers, of the notions, the sentiments, and the wants created by this same social condition, for the purpose of extending the circle of their power. This led me to think that the nations of Christendom would perhaps eventually undergo some oppression like that which hung over several of the nations of the ancient world.
 A more accurate examination of the subject, and five years of further meditation, have not diminished my fears, but have changed their object.
 No sovereign ever lived in former ages so absolute or so powerful as to undertake to administer by his own agency, and without the assistance of intermediate powers, all the parts of a great empire; none ever attempted to subject all his subjects indiscriminately to strict uniformity of regulation and personally to tutor and direct every member of the community. The notion of such an undertaking never occurred to the human mind; and if any man had conceived it, the want of information, the imperfection of the administrative system, and, above all, the natural obstacles caused by the inequality of conditions would speedily have checked the execution of so vast a design.
-"""
 
-EXAMPLE_MARKDOWN = """
+---
+
 ### Challenges of agent systems
 
 Generally, the difficult parts of running an agent system for the LLM engine are:
@@ -120,13 +129,13 @@ with gr.Blocks(theme=gr.themes.Soft(text_size='lg', font=["monospace"], primary_
     )
     separators_selection = gr.Textbox(
         elem_id="textbox_id",
-        value=["### ", "\n\n", "\n", ".", " "],
+        value=["\n\n", "\n", " ", ""],
         info="Separators used in RecursiveCharacterTextSplitter",
         show_label=False, # or set label to an empty string if you want to keep its space
         visible=False,
     )
     preset_selection = gr.Radio(
-        ['Text', 'Code', 'Markdown'],
+        ['Default', 'Python', 'Markdown'],
         label="Choose a preset",
         info="This will choose RecursiveCharacterTextSplitter with a specific set of separators."
     )
@@ -158,6 +167,11 @@ with gr.Blocks(theme=gr.themes.Soft(text_size='lg', font=["monospace"], primary_
         inputs=[text, slider_count, split_selection, separators_selection, length_unit_selection, chunk_overlap],
         outputs=[separators_selection, out],
     )
+    preset_selection.change(
+        fn=change_preset_separators,
+        inputs=preset_selection,
+        outputs=separators_selection,
+    )
     gr.on(
         [text.change, length_unit_selection.change, separators_selection.change, slider_count.change, chunk_overlap.change],
         chunk,
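
Review note: the `change_preset_separators` function added in this commit returns `None` for any choice outside its three branches, and it hands a Python list to a text field while `chunk` receives a `separators_str` argument. The sketch below shows one way both points could be handled with a table lookup and an explicit string round trip. It is not the commit's code: `PRESET_SEPARATORS`, `separators_for_preset`, `to_textbox_value` and `from_textbox_value` are hypothetical names, and the Markdown and Python lists are illustrative placeholders rather than the values LangChain returns from `get_separators_for_language`. Only the default list is taken from the diff.

```python
import ast

# The default list comes from the commit. The other two lists are
# illustrative placeholders, NOT the values LangChain returns.
DEFAULT_SEPARATORS = ["\n\n", "\n", " ", ""]
PRESET_SEPARATORS = {
    "Default": DEFAULT_SEPARATORS,
    "Markdown": ["\n## ", "\n\n", "\n", " ", ""],
    "Python": ["\nclass ", "\ndef ", "\n\n", "\n", " ", ""],
}


def separators_for_preset(choice):
    """Return a copy of a preset's separators, falling back to the default list."""
    return list(PRESET_SEPARATORS.get(choice, DEFAULT_SEPARATORS))


def to_textbox_value(separators):
    """Render the list as a Python literal so it can be stored as text."""
    return repr(separators)


def from_textbox_value(text):
    """Parse the literal back into a list of strings, rejecting anything else."""
    value = ast.literal_eval(text)
    if not isinstance(value, list) or not all(isinstance(s, str) for s in value):
        raise ValueError("expected a list of strings")
    return value
```

With helpers like these, a preset callback could return `to_textbox_value(separators_for_preset(choice))`, and the chunking function could call `from_textbox_value(separators_str)` before building the splitter. Using `ast.literal_eval` rather than `eval` keeps the parsing step limited to Python literals.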