Philip Vollet (@philipvollet)'s Twitter Profile
Philip Vollet

@philipvollet

Head of Developer Growth @weaviate_io & Open source lover tweeting about machine learning and data science projects.

ID: 421795636

Link: https://www.linkedin.com/in/philipvollet
Joined: 26-11-2011 11:40:03

17.17K Tweets

30.30K Followers

6.6K Following

Weaviate • vector database (@weaviate_io)

Are you in Austin this week for Confluent Current? Team Weaviate is delivering two talks in the expo hall, where we will showcase a demo app called Wealingo. It provides real-time, personalized language learning using Weaviate's vector database and Confluent Cloud's

Leonie (@helloiamleonie)

Different types of embeddings:

Sparse: [0, 3, 0, 1, …, 12, 0, 0]
Dense: [0.34, -3.75, -0.93, …, 1.53, 0.95]
Dense multi-vector (e.g., ColBERT): [[0.01, …, -0.03], … [-0.91, …, 0.23]]
Dense with variable dimensions (e.g., Matryoshka): 8 dimensions: [-0.03, -0.42,
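A rough illustration of those shapes (the numbers and dimensions below are made up for illustration, not taken from any real model):

```python
import numpy as np

# Sparse embedding: vocabulary-sized vector, mostly zeros (BM25/SPLADE-style).
sparse = np.zeros(30_000)
sparse[[7, 42, 10_512]] = [3.0, 1.0, 12.0]   # only a few non-zero weights

# Dense embedding: one fixed-length vector per text (e.g. 768 dimensions).
dense = np.random.randn(768)

# Dense multi-vector (ColBERT-style): one small vector per token,
# so a 20-token passage becomes a (20, 128) matrix.
multi_vector = np.random.randn(20, 128)

# Matryoshka-style embedding: the leading dimensions of one vector are
# usable on their own, so you can truncate to 8, 64, 256, ... dimensions.
full = np.random.randn(768)
truncated_8d = full[:8]

print(sparse.shape, dense.shape, multi_vector.shape, truncated_8d.shape)
```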

Victoria Slocum (@victorialslocum)

Optimizing your chunking techniques is one of the top places to improve performance in your RAG pipelines, but what’s the best one? Jina AI just released a new method called late chunking that takes the same amount of storage space as naive chunking, but solves the problem of

Google AI (@googleai)

Introducing our new whale bioacoustics model, which can identify eight distinct species, including multiple calls for two of those species. The model even includes the “Biotwang” sounds recently attributed to the Bryde’s whale. Learn more at: goo.gle/3Znukdk

Weaviate • vector database (@weaviate_io)

Late chunking is revolutionizing the way Retrieval Augmented Generation (RAG) systems retrieve information. 🧩

In naive chunking:
1. We separate the original document into chunks (e.g. sentences)
2. Each chunk is independently embedded into token-level representations
3. These
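A minimal sketch of the contrast, using a hypothetical `embed_tokens(text)` stand-in for a long-context embedding model that returns one vector per token; this illustrates the idea, not Jina AI's actual implementation:

```python
import numpy as np

def embed_tokens(text: str) -> np.ndarray:
    """Hypothetical model call: returns one vector per whitespace token."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal((len(text.split()), 384))

def naive_chunking(chunks: list[str]) -> list[np.ndarray]:
    # Each chunk is embedded on its own, so it never sees the rest of the document.
    return [embed_tokens(chunk).mean(axis=0) for chunk in chunks]

def late_chunking(document: str, chunks: list[str]) -> list[np.ndarray]:
    # Embed the whole document once, so every token vector carries document context,
    # then pool token vectors per chunk boundary afterwards ("late").
    token_vecs = embed_tokens(document)
    vectors, start = [], 0
    for chunk in chunks:
        n = len(chunk.split())
        vectors.append(token_vecs[start:start + n].mean(axis=0))
        start += n
    return vectors  # same number and size of stored vectors as naive chunking

doc = "Berlin is the capital of Germany. The city has 3.7 million inhabitants."
chunks = ["Berlin is the capital of Germany.", "The city has 3.7 million inhabitants."]
print(len(naive_chunking(chunks)), len(late_chunking(doc, chunks)))
```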

Jennifer Li (@jenniferhli)

I’ve said this before and I’ll say it again. The #1, #2, #3 deciding factor of a startup’s success is the shipping velocity. Companies have no long term moat, the only moat is a fast shipping culture.

daniel phiri (@malgamves)

incredibly excited to share that i will be speaking at the flagship ai event by dotConferences in paris next month.

check it out! 👨🏾‍🌾 dotai.io
Weaviate • vector database (@weaviate_io)

Did you know that 43% of users on retail websites go directly to the search bar and are 2-3x more likely to convert?

Yet, 64% of retail website managers have no clear plan on how to improve their search experience.

Find out how to make the most of AI search and nail your
Mark Riedl (@mark_riedl)

Making Large Language Models into World Models with Precondition and Effect Knowledge

arxiv.org/abs/2409.12278

I'm an RL guy. To me, a world model maps (state, action) -> state'

So let's make LLMs do that. Then we can build planning and reasoning algorithms on top of them.
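A toy sketch of that interface; `llm_propose_effects` is a hypothetical stand-in for an LLM prompted with precondition and effect knowledge, not the paper's method:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    facts: frozenset[str]          # e.g. {"door_locked", "has_key"}

def llm_propose_effects(state: State, action: str) -> tuple[set[str], set[str]]:
    """Hypothetical LLM call: returns (facts_added, facts_removed) for an action,
    after checking the action's preconditions against the current state."""
    if action == "unlock_door" and "has_key" in state.facts:
        return {"door_unlocked"}, {"door_locked"}
    return set(), set()            # precondition not met: no effect

def world_model(state: State, action: str) -> State:
    # world model: (state, action) -> state'
    added, removed = llm_propose_effects(state, action)
    return State(facts=(state.facts - removed) | added)

s0 = State(facts=frozenset({"door_locked", "has_key"}))
s1 = world_model(s0, "unlock_door")
print(sorted(s1.facts))
```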
abhinav (@abnux)

probably the most important essay you need to read as a designer, creator, founder for the upcoming era

“in a world of scarcity, we treasure tools.
in a world of abundance, we treasure taste.”
Weaviate • vector database (@weaviate_io)

How do you get the most out of your Retrieval-Augmented Generation (RAG) apps?

First crucial step is chunking! 

The process of breaking down large documents or texts into smaller, manageable pieces called ‘chunks’. This simple yet powerful pre-processing step is key to boosting
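As a baseline, here is a minimal fixed-size chunker with overlap (the sizes are arbitrary placeholders, not recommendations):

```python
def fixed_size_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into character windows of `chunk_size`, overlapping by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

document = "Retrieval-Augmented Generation pipelines retrieve chunks of text. " * 20
chunks = fixed_size_chunks(document)
print(len(chunks), len(chunks[0]))
```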
Weaviate • vector database (@weaviate_io)

3. Document-Based Chunking:
This technique creates chunks based on the natural divisions within the document, such as headings or sections. It’s very effective for structured data like HTML, Markdown, or code files but it’s less useful when the data lacks clear structural
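A minimal sketch for Markdown: split on heading lines so each section becomes one chunk (a real pipeline would also handle code fences, nested headings, and maximum chunk sizes):

```python
import re

def markdown_section_chunks(markdown: str) -> list[str]:
    """Split a Markdown document into chunks at heading lines (#, ##, ...)."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]

doc = "# Intro\nWhat this covers.\n\n## Setup\nInstall steps.\n\n## Usage\nRun it."
for chunk in markdown_section_chunks(doc):
    print(repr(chunk))
```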
Weaviate • vector database (@weaviate_io)

4. Semantic Chunking
In this technique, the text is divided into meaningful units, such as sentences or paragraphs, which are then vectorized. These units are then combined into chunks based on the cosine distance between their embeddings, with a new chunk formed whenever a
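A sketch of that idea with a hypothetical `embed(sentence)` stand-in (any sentence-embedding model would do); sentences accumulate into the current chunk until the cosine distance to the previous sentence crosses a threshold:

```python
import numpy as np

def embed(sentence: str) -> np.ndarray:
    """Hypothetical embedding call; random vectors here, so splits are illustrative only."""
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    vec = rng.standard_normal(384)
    return vec / np.linalg.norm(vec)

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_chunks(sentences: list[str], threshold: float = 0.5) -> list[str]:
    chunks, current = [], [sentences[0]]
    prev_vec = embed(sentences[0])
    for sentence in sentences[1:]:
        vec = embed(sentence)
        # Start a new chunk whenever the topic appears to shift, i.e. the
        # distance to the previous sentence's embedding exceeds the threshold.
        if cosine_distance(prev_vec, vec) > threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
        prev_vec = vec
    chunks.append(" ".join(current))
    return chunks

print(semantic_chunks(["Cats purr.", "Dogs bark.", "GPUs accelerate matrix math."]))
```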
meng shao (@shao__meng)

5 chunking techniques you need to know for RAG

The article from Weaviate • vector database highlights the importance of chunking in RAG applications. It is critical for improving LLM performance, making RAG applications smarter, faster, and more efficient.

The article covers five main chunking techniques:

01 - Fixed-size chunking:
- Method: split the text into fixed-size chunks, ignoring natural break points or structure in the content.
-
Weaviate • vector database (@weaviate_io)

We introduced the WeaviateAsyncClient for Python!

• Easy integration with local environments, Weaviate Cloud, and custom setups
• Works seamlessly with FastAPI for modular web API microservices

Read more about the release: weaviate.io/blog/weaviate-…
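A rough sketch of how this might look with FastAPI, assuming the v4 async API names (`weaviate.use_async_with_local`, an awaited `connect()`/`close()`, and a `near_text` query); check the linked release post for the exact interface:

```python
# Assumed names from the weaviate-client v4 async API; verify against the release post.
from contextlib import asynccontextmanager

import weaviate
from fastapi import FastAPI

async_client = weaviate.use_async_with_local()   # or use_async_with_weaviate_cloud(...)

@asynccontextmanager
async def lifespan(app: FastAPI):
    await async_client.connect()     # the async client must be connected explicitly
    yield
    await async_client.close()

app = FastAPI(lifespan=lifespan)

@app.get("/search")
async def search(q: str):
    # Assumes an "Article" collection already exists in the Weaviate instance.
    articles = async_client.collections.get("Article")
    response = await articles.query.near_text(query=q, limit=5)
    return [obj.properties for obj in response.objects]
```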