risedangel

AI & ML interests

None yet

Recent Activity

liked a dataset about 19 hours ago
selimfirat/bilkent-turkish-writings-dataset
reacted to ZennyKenny's post with ❤️ 2 days ago
🖤 Probably one of my favorite projects that I've worked on so far, introducing Новояз (Novoyaz). 🛠 One of the first acts of the Bolshevik government after the Russian Revolution was the reform and standardization of the Russian language, which at the time had a non-standard and challenging orthography. 📚 Upon its reform the government launched a nationwide campaign called Ликбез (Likbez), which sought to improve literacy in the country (by the way, it worked, bringing the national literacy rate from <20% in the 1920s to >80% by the 1930s). ‼ While this is a remarkable result that should absolutely be celebrated, it's one that has left behind literally hundreds of thousands if not millions of artifacts using pre-reform Russian orthography. 😓 Researchers and historians are working tirelessly to translate these artifacts to modern Russian so that they may be archived and studied, but many have told me that they are doing this BY HAND (!). 💡 I thought, well, this is a perfect use case for OCR and a fine-tuned LLM to step in and aid this important work! 🌏 Introducing НОВОЯЗ (NOVOYAZ)! Powered by https://huggingface.co/ChatDOC/OCRFlux-3B and https://huggingface.co/ZennyKenny/oss-20b-prereform-to-modern-ru-merged, researchers can now convert images of their pre-reform documents to modern Russian orthography using the power of open-source AI! Check it out and drop a like to support more real-world use cases for open-source AI outside of traditional tech-centric domains! https://huggingface.co/spaces/ZennyKenny/Novoyaz

Organizations

None yet