Org Chart Hierarchy
I have a use case where I need to retrieve information from an org chart and output it in JSON while maintaining the correct hierarchy.
I was using Mistral Small 3.1 (24B), but it is 17GB in size and its accuracy averages 80-90%. I'm considering switching to a smaller model and fine-tuning it: the size would decrease, and accuracy might improve after fine-tuning.
Please help me here. If I’m wrong in any way, feel free to suggest alternatives.
You're right to consider switching to a smaller model and fine-tuning it for your specific task, @okayatul.
If your main goal is to extract hierarchical relationships and convert them into structured JSON, you might not need a 24B model. A smaller model (around 3B–7B) fine-tuned specifically on your kind of org chart data and task could actually perform better than a larger general-purpose model, especially if the domain is narrow or repetitive.
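To make the target of such fine-tuning concrete, here is a minimal sketch of what the structured JSON output could look like. It assumes each person has exactly one manager and the chart has a single root; the field names `name` and `reports` and the helper `build_hierarchy` are hypothetical, chosen only for illustration.

```python
import json

def build_hierarchy(pairs):
    """Turn (employee, manager) pairs into a nested dict.

    A manager of None marks the root. Assumes every manager also
    appears as an employee somewhere in `pairs`.
    """
    # One node per person, each starting with an empty list of reports.
    nodes = {name: {"name": name, "reports": []} for name, _ in pairs}
    root = None
    for name, manager in pairs:
        if manager is None:
            root = nodes[name]
        else:
            nodes[manager]["reports"].append(nodes[name])
    return root

# Made-up example data, not taken from a real org chart.
pairs = [
    ("Alice", None),
    ("Bob", "Alice"),
    ("Carol", "Alice"),
    ("Dave", "Bob"),
]

tree = build_hierarchy(pairs)
print(json.dumps(tree, indent=2))
```

Pairing each org chart image with a JSON tree in one fixed shape like this gives the smaller model a single, consistent output format to learn.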
Try Qwen2-VL or Qwen2.5-VL, or OCR tools such as Nanonets and MonkeyOCR. Test them all on your own org charts and pick the one that best suits your needs.
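One possible way to compare the candidates is to score each model's JSON against a hand-labelled reference by comparing manager-to-report edges. This is only a sketch: it assumes the nested `name`/`reports` format (a hypothetical schema) and that names are unique, and the functions `edges` and `edge_f1` are illustrative, not part of any library.

```python
def edges(node, parent=None):
    """Flatten a nested org chart into a set of (manager, employee) pairs."""
    found = set()
    if parent is not None:
        found.add((parent, node["name"]))
    for child in node.get("reports", []):
        found |= edges(child, node["name"])
    return found

def edge_f1(predicted, expected):
    """F1 score over reporting edges; 1.0 means the hierarchies match exactly."""
    pred, gold = edges(predicted), edges(expected)
    hits = len(pred & gold)
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(gold)
    return 2 * precision * recall / (precision + recall)

# Made-up reference chart: Dave reports to Bob.
gold = {"name": "Alice", "reports": [
    {"name": "Bob", "reports": [{"name": "Dave", "reports": []}]},
    {"name": "Carol", "reports": []},
]}

# Made-up model output: Dave was wrongly placed under Carol.
predicted = {"name": "Alice", "reports": [
    {"name": "Bob", "reports": []},
    {"name": "Carol", "reports": [{"name": "Dave", "reports": []}]},
]}

score = edge_f1(predicted, gold)
print(round(score, 3))
```

Scoring edges rather than exact string matches gives partial credit when most of the hierarchy is right, which makes it easier to see which model is closest.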
Thanks for the reply. However, the issue is the dataset: this type of data, specifically organizational charts, isn't available anywhere as far as I can see. Please guide me on what to do and how to proceed.
Secondly, I also tried the OCR tools you mentioned, but they didn't return any results when I gave them an org chart as input.