ontocord (@ontocord) Twitter Tweets • TwiDoom

ontocord

@ontocord


We dedicate ourselves to bringing lawful and effective data to AI training so that everyone can benefit from human knowledge. ontocord.ai

ID: 1384510783166590981

Joined 20-04-2021 14:14:36

109 Tweets

403 Followers

132 Following

Jenia Jitsev 🏳️‍🌈 🇺🇦

@jjitsev

9 months ago

A lot of the struggles in the hopefully upcoming EU AI Act concern foundation models. I think it is very important to avoid a misconception: foundation models are scientific artifacts required for basic research on machine learning, not finished tools to be deployed for end users. 1/5

24 Likes

3 Replies

12 Retweets

qnguyen3

@stablequan

9 months ago

Today, I released my first paper, VinaLLaMA. The state-of-the-art LLM for Vietnamese, based on LLaMA-2. Continued pretrain and SFT 100% with synthetic data. Special thanks to Teknium (e/λ) & LDJ. Their OpenHermes and Capybara datasets helped me a lot. arxiv.org/abs/2312.11011

137 Likes

12 Replies

21 Retweets

LAION

@laion_ai

9 months ago

LAION has a zero tolerance policy for illegal content. We work with organizations like IWF and others to validate links in the LAION datasets with filtering tools developed by our community and partner organizations to ensure they are safe. laion.ai/notes/laion-ma…

66 Likes

3 Replies

13 Retweets

Jon Durbin

@jon_durbin

7 months ago

🚀🥯 bagel 20b v0.4 family now available 🥯🚀 Fine-tunes of internlm2-20b, with the latest bagel dataset. Prompting tips and such in the model card. 4 options to choose from: • DPO with internlm2 modeling code huggingface.co/jondurbin/bage… • non-DPO with internlm2 modeling code

34 Likes

5 Replies

5 Retweets

SambaNova Systems

@sambanovaai

7 months ago

BigScience Research Workshop We also want to thank ontocord and LAION for open sourcing the alignment data that was used to train the model.

3 Likes

1 Reply

1 Retweet

qnguyen3

@stablequan

7 months ago

Hello Vietnam! Both Llama and Mistral-based Vietnamese models from ontocord are now available on ollama. Enjoy😉

72 Likes

8 Replies

13 Retweets

OcciGlot

@occiglot

6 months ago

Today, we are announcing Occiglot! A large-scale collaborative research collective focusing on open-source European LLMs. We invite anybody working on multilingual datasets, benchmarks, or models to get in touch/join our discord. occiglot.github.io/occiglot/posts…

188 Likes

10 Replies

55 Retweets

EnricoShippole

@enricoshippole

6 months ago

@TeraflopAI is excited to help support the Caselaw Access Project and lil (library innovation lab), in the release of over 6.6 million state and federal court decisions published throughout U.S. history.

93 Likes

3 Replies

35 Retweets

ontocord

@ontocord

6 months ago

Announcing our official launch of the Aurora-M series of multilingual models red teamed for the Biden-Harris AI Executive Order concerns. Blog: huggingface.co/blog/mayank-mi… Thanks to the MDEL community & to CSC - IT Center for Science and TurkuNLP for compute.

22 Likes

1 Reply

12 Retweets

ontocord

@ontocord

6 months ago

Announcing our preprint - an Ontocord.AI open science 16b model to promote lawful AI: **Aurora-M**, red-teamed for concerns under the #WhiteHouse Executive Order on the Safe, Secure, and Trustworthy development and use of AI - arxiv.org/abs/2404.00399 So proud of our team!

14 Likes

0 Replies

3 Retweets

AK

@_akhaliq

6 months ago

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order. Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and

333 Likes

21 Replies

65 Retweets

Simone Tedeschi

@simonetedeschi_

5 months ago

For more details, check out our preprint: arxiv.org/pdf/2404.08676… 🤓 Huge thanks to felfri, Patrick, Kristian Kersting, Roberto Navigli, Huu and Bo (and all the organizations involved Babelscape SapienzaNLP TU Darmstadt hessian.AI DFKI ontocord The University of Chicago UIUC NLP)🫂

4 Likes

0 Replies

1 Retweet

Jenia Jitsev 🏳️‍🌈 🇺🇦

@jjitsev

20 days ago

LAION-5B is an important reference research dataset for reproducible language-vision foundation model studies. We release Re-LAION-5B as a transparent safety iteration on LAION-5B, which fixes issues and allows the broad research community to continue using open datasets as a reference🧵

281 Likes

5 Replies

74 Retweets