Sainbayar B. (Б. Сайнбаяр)

onlysainaa

AI & ML interests

https://www.instagram.com/only_sainaa/

Recent Activity

updated a model about 1 month ago
onlysainaa/outputs
published a model about 1 month ago
onlysainaa/outputs
updated a model about 1 month ago
onlysainaa/gemma3-12b-unsloth-finetuned

Organizations

Stable Diffusion Dreambooth Concepts Library

onlysainaa's activity

reacted to as-cle-bert's post with ❤️ about 1 month ago
Post
1893
One of the biggest challenges I've been facing since I started developing [𝐏𝐝𝐟𝐈𝐭𝐃𝐨𝐰𝐧](https://github.com/AstraBert/PdfItDown) has been correctly handling the conversion of files like Excel sheets and CSVs: table conversion was bad and messy, almost unusable for downstream tasks🫣

That's why today I'm excited to introduce 𝐫𝐞𝐚𝐝𝐞𝐫𝐬, the new feature of PdfItDown v1.4.0!🎉

With 𝘳𝘦𝘢𝘥𝘦𝘳𝘴, you can choose among three (for now👀) flavors of text extraction and conversion to PDF:

- 𝗗𝗼𝗰𝗹𝗶𝗻𝗴, which does a fantastic job with presentations, spreadsheets and Word documents🦆

- 𝗟𝗹𝗮𝗺𝗮𝗣𝗮𝗿𝘀𝗲 by LlamaIndex, suitable for more complex and articulated documents with a mixture of text, images and tables🦙

- 𝗠𝗮𝗿𝗸𝗜𝘁𝗗𝗼𝘄𝗻 by Microsoft, not the best at handling highly structured documents, but extremely flexible in terms of input file format (it can even convert XML, JSON and ZIP files!)✒️

You can use this new feature in your Python scripts (check the attached code snippet!😉) and in the command line interface as well!🐍
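
As a rough illustration of how the three readers described in the list might be chosen per file, here is a minimal Python sketch. It does not call PdfItDown and is not its API: the `pick_reader` helper, the `READER_BY_SUFFIX` mapping, the `complex_layout` flag and the lowercase reader names are all hypothetical, and the extension-to-reader choices are assumptions drawn only from the descriptions in the post.

```python
from pathlib import Path

# Hypothetical mapping from file extension to reader name.
# The pairings are assumptions based on the post's descriptions:
# Docling for presentations, spreadsheets and Word documents,
# MarkItDown for loosely structured formats such as XML, JSON and ZIP.
READER_BY_SUFFIX = {
    ".pptx": "docling",
    ".xlsx": "docling",
    ".docx": "docling",
    ".xml": "markitdown",
    ".json": "markitdown",
    ".zip": "markitdown",
}


def pick_reader(path: str, complex_layout: bool = False) -> str:
    """Return a reader name for `path` (illustrative heuristic only).

    `complex_layout` is a hypothetical flag the caller sets for documents
    mixing text, images and tables, which the post assigns to LlamaParse.
    """
    if complex_layout:
        return "llamaparse"
    suffix = Path(path).suffix.lower()
    # Fall back to MarkItDown, described as the most flexible on input format.
    return READER_BY_SUFFIX.get(suffix, "markitdown")


if __name__ == "__main__":
    for name in ["slides.pptx", "budget.XLSX", "dump.json", "notes.txt"]:
        print(name, "->", pick_reader(name))
```

The returned name would then be handed to whatever conversion entry point the library actually exposes; see the PdfItDown repository for the real usage.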

Have fun and don't forget to star the repo on GitHub ➡️ https://github.com/AstraBert/PdfItDown
upvoted an article 6 months ago
Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

By mlabonne
331