Update README.md
README.md
CHANGED
@@ -11,8 +11,9 @@ tags:
 # Description
 We use MS Marco Encoder msmarco-MiniLM-L-6-v3 to encode the text from dataset [abokbot/wikipedia-first-paragraph](https://huggingface.co/datasets/abokbot/wikipedia-first-paragraph).
 
-
+The dataset contains the first paragraphs of the English "20220301.en" version of the [Wikipedia dataset](https://huggingface.co/datasets/wikipedia).
 
+The output is an embedding tensor of size [6458670, 384].
 
 # Code
 It was obtained by running the following code.