ontocord (@ontocord) 's Twitter Profile
ontocord

@ontocord

We dedicate ourselves to bringing lawful and effective data to AI training so that everyone can benefit from human knowledge. ontocord.ai

ID: 1384510783166590981

Joined: 20-04-2021 14:14:36

109 Tweets

403 Followers

132 Following

Jenia Jitsev 🏳️‍🌈 🇺🇦 (@jjitsev) 's Twitter Profile Photo

A lot of the struggles around the hopefully upcoming EU AI Act concern foundation models. I think it is very important to avoid a misconception: foundation models are scientific artifacts required for basic research on machine learning, not finished tools to be deployed for end users. 1/5

qnguyen3 (@stablequan) 's Twitter Profile Photo

Today, I released my first paper, VinaLLaMA, the state-of-the-art LLM for Vietnamese, based on LLaMA-2. Continued pretraining and SFT were done 100% with synthetic data. Special thanks to Teknium (e/λ) & LDJ. Their OpenHermes and Capybara datasets helped me a lot. arxiv.org/abs/2312.11011

LAION (@laion_ai) 's Twitter Profile Photo

LAION has a zero tolerance policy for illegal content. We work with organizations like IWF and others to validate links in the LAION datasets with filtering tools developed by our community and partner organizations to ensure they are safe. laion.ai/notes/laion-ma…

Jon Durbin (@jon_durbin) 's Twitter Profile Photo

🚀🥯 bagel 20b v0.4 family now available 🥯🚀 Fine-tunes of internlm2-20b, with the latest bagel dataset. Prompting tips and such in the model card. 4 options to choose from: • DPO with internlm2 modeling code huggingface.co/jondurbin/bage… • non-DPO with internlm2 modeling code

OcciGlot (@occiglot) 's Twitter Profile Photo

Today, we are announcing Occiglot! A large-scale collaborative research collective focusing on open-source European LLMs. We invite anybody working on multilingual datasets, benchmarks, or models to get in touch/join our discord. occiglot.github.io/occiglot/posts…

ontocord (@ontocord) 's Twitter Profile Photo

Announcing our official launch of the Aurora-M series of multilingual models red teamed for the Biden-Harris AI Executive Order concerns. Blog: huggingface.co/blog/mayank-mi… Thanks to the MDEL community & to CSC - IT Center for Science and TurkuNLP for compute.

ontocord (@ontocord) 's Twitter Profile Photo

Announcing our preprint - an Ontocord.AI open science 16b model to promote lawful AI: **Aurora-M**, red-teamed for concerns under the #WhiteHouse Executive Order on the Safe, Secure, and Trustworthy development and use of AI - arxiv.org/abs/2404.00399 So proud of our team!

AK (@_akhaliq) 's Twitter Profile Photo

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order. Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and

Jenia Jitsev 🏳️‍🌈 🇺🇦 (@jjitsev) 's Twitter Profile Photo

LAION-5B is an important reference research dataset for reproducible studies of language-vision foundation models. We release Re-LAION-5B as a transparent safety iteration on LAION-5B, which fixes issues and allows the broad research community to continue using open datasets as a reference 🧵