Sainbayar B. (Б. Сайнбаяр)

onlysainaa

sainaaBL3SSyou

AI & ML interests

https://www.instagram.com/only_sainaa/

onlysainaa's activity

updated a model about 1 month ago

onlysainaa/outputs

Updated May 6

published a model about 1 month ago

onlysainaa/outputs

Updated May 6

updated a model about 1 month ago

onlysainaa/gemma3-12b-unsloth-finetuned

Text Generation • Updated May 2 • 6

published a model about 1 month ago

onlysainaa/gemma3-12b-unsloth-finetuned

Text Generation • Updated May 2 • 6

reacted to as-cle-bert's post with ❤️ about 1 month ago

Post

One of the biggest challenges I've faced since I started developing [𝐏𝐝𝐟𝐈𝐭𝐃𝐨𝐰𝐧](https://github.com/AstraBert/PdfItDown) was correctly handling the conversion of files like Excel sheets and CSVs: table conversion was messy and almost unusable for downstream tasks🫣

That's why today I'm excited to introduce 𝐫𝐞𝐚𝐝𝐞𝐫𝐬, the new feature of PdfItDown v1.4.0!🎉

With 𝘳𝘦𝘢𝘥𝘦𝘳𝘴, you can choose among three (for now👀) flavors of text extraction and conversion to PDF:

- 𝗗𝗼𝗰𝗹𝗶𝗻𝗴, which does a fantastic job with presentations, spreadsheets and Word documents🦆

- 𝗟𝗹𝗮𝗺𝗮𝗣𝗮𝗿𝘀𝗲 by LlamaIndex, suitable for more complex and articulated documents that mix text, images and tables🦙

- 𝗠𝗮𝗿𝗸𝗜𝘁𝗗𝗼𝘄𝗻 by Microsoft, not the best at handling highly structured documents, but extremely flexible in terms of input file format (it can even convert XML, JSON and ZIP files!)✒️

You can use this new feature in your Python scripts (check the attached code snippet!😉) and in the command line interface as well!🐍
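As a rough illustration of how the three flavors differ, here is a minimal sketch of a helper that picks a reader from the file extension, following the descriptions above. Everything in it is an assumption made for illustration: `pick_reader`, the extension sets, the `complex_layout` flag and the returned label strings are hypothetical and are not part of PdfItDown's API, so check the repo for the real option names.

```python
from pathlib import Path

# Hypothetical extension groups, inferred from the post's descriptions.
# They are NOT taken from PdfItDown itself.
DOCLING_EXTS = {".pptx", ".xlsx", ".csv", ".docx"}  # presentations, spreadsheets, Word documents
MARKITDOWN_EXTS = {".xml", ".json", ".zip"}         # formats the post says MarkItDown can convert


def pick_reader(file_path: str, complex_layout: bool = False) -> str:
    """Return an illustrative reader label for a file (hypothetical helper)."""
    ext = Path(file_path).suffix.lower()
    if complex_layout:
        # Documents mixing text, images and tables: the post suggests LlamaParse.
        return "llamaparse"
    if ext in DOCLING_EXTS:
        return "docling"
    # MarkItDown is described as the most flexible on input formats,
    # so this sketch uses it as the fallback as well.
    return "markitdown"


if __name__ == "__main__":
    for name in ["sales.xlsx", "slides.pptx", "config.json", "notes.txt"]:
        print(name, "->", pick_reader(name))
```

The fallback to MarkItDown is a design choice of this sketch, based only on the post calling it the most flexible option for input formats.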

Have fun and don't forget to star the repo on GitHub ➡️ https://github.com/AstraBert/PdfItDown