Tom Jobbins (@TheBlokeAI)'s Twitter Profile
Tom Jobbins

@TheBlokeAI

My Hugging Face repos: https://t.co/yh7J4DFGTc
Discord server: https://t.co/5h6rGsGfBx
Patreon: https://t.co/yfQwFggGtx

ID:161524644

Link: https://www.patreon.com/TheBlokeAI | Joined: 01-07-2010 02:27:07

336 Tweets

15.4K Followers

237 Following

Tom Jobbins (@TheBlokeAI)

Meta's CodeLlama is here!

ai.meta.com/blog/code-llam…

7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python

First time we've seen the 34B model

I've got a couple of fp16s up:
huggingface.co/TheBloke/CodeL…
huggingface.co/TheBloke/CodeL…

More coming soon obvs

Tom Jobbins (@TheBlokeAI)

Transformers 4.32.0 now supports GPTQ models natively!

Over the last couple of days I have updated 296 of my GPTQ repos to provide automatic support for this.

It's awesome: you can now load a GPTQ model directly in Transformers with just two lines of code!
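As a minimal sketch of that two-line load, assuming transformers >= 4.32 with the auto-gptq backend installed; the repo id below is an illustrative placeholder, not a specific release:

```python
# Sketch only: assumes transformers >= 4.32 and auto-gptq are installed.
# The repo id is a hypothetical example of a GPTQ repo name.
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_gptq(repo_id: str = "TheBloke/example-model-GPTQ"):
    """Transformers reads the quantization_config shipped inside the repo,
    so no manual GPTQ setup is needed -- just the usual two calls."""
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return tokenizer, model
```

Running this downloads the quantised weights and dequantisation config from the Hub, exactly as for an fp16 checkpoint.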

Tom Jobbins (@TheBlokeAI)

Very interesting new specialist models from WizardLM!

My quantisations are processing and uploading now: 7B done, 13B within 20 mins; 70B in ~2-3 hours.

For the GPTQs I used the Camel-AI/Math dataset for calibration, to hopefully maximise quantisation quality vs a generic dataset like Wikitext.

Tom Jobbins (@TheBlokeAI)

So sad to hear that Bram Moolenaar, creator of Vim, has died.

:%s/Bram\( Moolenaar\)\?/&\ (1961 - 2023)/g

:qa! in peace, Bram.

Tom Jobbins (@TheBlokeAI)

A second OpenOrca Llama 2 13B preview!

I have done GGMLs and GPTQs at:
huggingface.co/TheBloke/OpenO…
huggingface.co/TheBloke/OpenO…

chansung (@algo_diver)

Currently working on adding GPTQ mode in LLM Chatbot! This will give more options to many of you.

Thanks to Tom Jobbins, all models are already quantized in GPTQ, so just need to integrate them.

First, GPU mode will be added. I had some issues with CPU (index out of range)

Tom Jobbins (@TheBlokeAI)

Great to see Airoboros Llama 2!

Quantisations here:
huggingface.co/TheBloke/airob…
huggingface.co/TheBloke/airob…
huggingface.co/TheBloke/airob…
huggingface.co/TheBloke/airob…
huggingface.co/TheBloke/airob…

(GGML not possible for 70B HF models yet, I will do it as soon as it's possible)

Eric Hartford (@erhartford)

Today I released Dolphin, an open-source implementation of Microsoft's Orca. huggingface.co/ehartford/dolp…
An uncensored model, licensed for non-commercial use as it is based on llama1.
I am currently training on llama2 and I require compute sponsorship. Please reach out.

Tom Jobbins (@TheBlokeAI)

An exciting few days with Llama 2!

But here's an interesting model that's flown under the radar: Upstage's Llama 30B Instruct. Eval shows it beating 65B & 70B?

Trained on OpenOrca + more.

Original: huggingface.co/upstage/llama-…
huggingface.co/TheBloke/upsta…
huggingface.co/TheBloke/upsta…

Tim Dettmers (@Tim_Dettmers)

The result of long days of CUDA optimizations: the new bitsandbytes release includes 4-bit inference, which is up to 4.2x faster than 16-bit inference (bsz=1). Full HF integration for all models. No code change needed.

Bnb is growing rapidly, just shy of 1M installs/month🧵

ausboss — e/acc (@Zanzibased)

I've created a KoboldAI API wrapper for @langchain, now available in v0.0.231 😀 It's compatible with koboldcpp too!

If you're not familiar with KoboldAI, it's a popular web UI for running local models.

Here's a glimpse of it running smoothly in a notebook. 👀
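Under the hood, such a wrapper talks to the KoboldAI United API's /api/v1/generate endpoint (the same one koboldcpp serves). A stdlib-only sketch of that call, with an illustrative localhost endpoint and defaults:

```python
# Stdlib-only sketch of a KoboldAI API call; the endpoint URL and
# parameter values are illustrative defaults, not fixed requirements.
import json
import urllib.request

def kobold_generate(prompt: str,
                    endpoint: str = "http://localhost:5001",
                    max_length: int = 80) -> str:
    """POST a prompt to a running KoboldAI/koboldcpp server and return
    the generated continuation text."""
    payload = json.dumps({"prompt": prompt, "max_length": max_length})
    req = urllib.request.Request(
        f"{endpoint}/api/v1/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The server returns {"results": [{"text": "..."}]}.
    return body["results"][0]["text"]
```

A LangChain wrapper packages this request/response cycle behind the standard LLM interface.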

I've created a KoboldAI API wrapper for @langchain now available in v0.0.231 😀It's compatible with koboldcpp too! If you are not familiar with KoboldAI its a popular webui for running local models. Here's a glimpse of it running smoothly in a notebook. 👀 #opensource
Wing Lian (caseus) (@winglian)

OpenOrca 13B Preview1 is here! huggingface.co/Open-Orca/Open…

Collaboration with Alignment Lab AI and Teknium (e/λ) to clean the dataset and get a SoTA model out. Thanks to Tom Jobbins for quantization.

HF Spaces available to kick the tires on this model at huggingface.co/spaces/openacc…

Tom Jobbins (@TheBlokeAI)

The first OpenOrca preview is out! 🚀

'[we] reproduce the dataset generated for Microsoft's Orca Paper. We have trained on less than 6% of our data.. to preview of what is possible..'

My quants:
huggingface.co/TheBloke/OpenO…
huggingface.co/TheBloke/OpenO…
Original:
huggingface.co/Open-Orca/Open…

Tom Jobbins (@TheBlokeAI)

I have reached quantisation nirvana.. making 9 GPTQs at once!

This Latitude.sh server is a monster, and it is always hungry! 👹

Georgi Gerganov (@ggerganov)

llama.cpp now supports distributed inference across multiple devices via MPI

This is possible thanks to Evan Miller's work. Looking for people to give this a try and attempt to run a 65B LLaMA on a cluster of Raspberry Pis 🙃

github.com/ggerganov/llam…
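A rough sketch of the MPI workflow, assuming an MPI toolchain (e.g. Open MPI or MPICH) is installed; the host addresses and model path are placeholders:

```shell
# Build llama.cpp with MPI support (requires mpicc/mpicxx on PATH).
make CC=mpicc CXX=mpicxx LLAMA_MPI=1

# One address per node; layers are sharded across the listed hosts.
cat > hostfile <<EOF
192.168.0.1
192.168.0.2
192.168.0.3
EOF

# Launch one process per node; rank 0 drives the prompt and sampling.
mpirun -hostfile hostfile -n 3 ./main \
    -m ./models/65B/ggml-model-q4_0.bin -p "Hello" -n 128
```

Each node only needs enough RAM for its own shard of the layers, which is what makes a Raspberry Pi cluster plausible.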
