NextGenC committed
Commit d76870f · verified · 1 Parent(s): 43c551d

Update README.md

  1. README.md +61 -0
README.md CHANGED
@@ -117,3 +117,64 @@ The system is modular, consisting of several Python components:
  - **Temporal Analysis: Enhance src/analysis/temporal.py with different trend detection algorithms.**
  - **Visualization: Customize graph appearance in src/visualization/plotting.py.**
  - **Data Storage: Modify src/data_management/storage.py to use different formats or databases.**
+
+ ## Folder
+ C:.
+ │   requirements.txt  # Project dependencies
+ │   reset_status.py  # Utility script (optional)
+ │   run_analysis.py  # Script to run the analysis pipeline
+ │   run_extractor.py  # Script to run the extraction pipeline
+ │   run_loader.py  # Script to run the data loading pipeline
+ │   README.md  # Project description (This file!)
+ │   .gitignore  # Files to ignore for Git
+ │
+ ├───data  # Data directory
+ │   ├───processed_data  # Processed data output from scripts
+ │   │       analysis_*.parquet  # Analysis results
+ │   │       concepts.parquet
+ │   │       concept_embeddings.pkl
+ │   │       concept_similarities.parquet
+ │   │       documents.parquet
+ │   │       mentions.parquet
+ │   │       relationships.parquet
+ │   │
+ │   └───raw  # Raw input data (e.g., PDFs)
+ │           example.pdf  # Place your input PDFs here
+ │
+ ├───notebooks  # Jupyter notebooks for exploration/testing (optional)
+ │       exploration.ipynb
+ │
+ ├───output  # Output files generated by analysis
+ │   │   *.png  # Image outputs (if any)
+ │   │
+ │   ├───graphs  # Interactive graph visualizations
+ │   │       concept_network_visualization.html
+ │   │
+ │   └───networks  # Saved network data
+ │           concept_network.pkl
+ │
+ └───src  # Source code directory
+     │   __init__.py
+     │
+     ├───analysis  # Analysis modules
+     │   │   __init__.py
+     │   │   network_analysis.py  # Calculates network metrics
+     │   │   network_builder.py  # Builds the NetworkX graph
+     │   │   similarity.py  # Calculates semantic similarity
+     │   │   temporal.py  # Performs temporal analysis
+     │
+     ├───core  # Core functionalities/utilities (optional)
+     │   │   __init__.py
+     │
+     ├───data_management  # Data loading and saving modules
+     │   │   __init__.py
+     │   │   loaders.py  # Loads raw data (e.g., PDFs)
+     │   │   storage.py  # Handles saving/loading processed data (Parquet/Pickle)
+     │
+     ├───extraction  # Concept and relationship extraction modules
+     │   │   __init__.py
+     │   │   extractor.py  # Main extraction logic using spaCy
+     │
+     └───visualization  # Visualization modules
+         │   __init__.py
+         │   plotting.py  # Generates visualizations (Pyvis, Matplotlib etc.)
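
The folder layout listed above can be checked, or recreated on a fresh clone, with a few lines of standard-library Python. This is only a sketch: `EXPECTED_DIRS`, `missing_dirs`, and `create_layout` are hypothetical names, not part of the repository, and the directory list is copied from the listing on the assumption that the scripts expect these paths relative to the project root.

```python
from pathlib import Path

# Directories taken from the folder listing, relative to the project root.
EXPECTED_DIRS = [
    "data/processed_data",
    "data/raw",
    "notebooks",
    "output/graphs",
    "output/networks",
    "src/analysis",
    "src/core",
    "src/data_management",
    "src/extraction",
    "src/visualization",
]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def create_layout(root):
    """Create any missing directories (hypothetical helper, not in the repo)."""
    for d in missing_dirs(root):
        (Path(root) / d).mkdir(parents=True, exist_ok=True)
```

Calling `missing_dirs(".")` from the project root lists whatever is absent; `create_layout(".")` creates only the empty folders and does not touch existing files.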