daven3 committed on
Commit
296161e
1 Parent(s): 4eaa0e4

Update README.md

Files changed (1)
  1. README.md +23 -0
README.md CHANGED
@@ -1,3 +1,26 @@
1
  ---
2
  license: apache-2.0
3
+ datasets:
4
+ - daven3/geosignal
5
+ language:
6
+ - en
7
  ---
8
+
9
+ <div style="text-align:center">
10
+ <img src="https://big-cheng.com/k2/k2.png" alt="k2-logo" width="200"/>
11
+ <h2>Delta Model for Large Language Model for Geoscience</h2>
12
+ </div>
13
+
14
+ **Note: Due to network issues, we are still uploading the models 🤦🏻**
15
+
16
+ ## Introduction
17
+
18
+ We introduce **K2** (7B), an open-source language model trained in two stages: first, further pretraining LLaMA on collected and cleaned geoscience literature, including open-access geoscience papers and Wikipedia pages; second, fine-tuning on knowledge-intensive instruction-tuning data (GeoSignal). For preliminary evaluation, we use GeoBenchmark, which consists of NPEE and AP test questions on Geology, Geography, and Environmental Science. K2 outperforms several baseline models of similar parameter count on both objective and subjective tasks.
19
+ To comply with the LLaMA model license, we release K2 as delta weights, obtained after further pretraining on the geoscience text corpus.
20
+
21
+ ***The following is an overview of K2 training:***
22
+ ![overview](https://big-cheng.com/k2/overview.png)
23
+
24
+ ## How to Use
25
+
26
+ Please refer to the [K2](https://github.com/davendw49/k2) GitHub repo for further usage.
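
Because the README releases K2 as delta weights rather than full weights, a user has to combine them with the original LLaMA weights before running the model. The sketch below only illustrates the idea and is not the K2 repo's actual script. It assumes the common additive convention (recovered weight = base weight + delta), which the README does not confirm. The names `apply_delta` and `Weights` are hypothetical, and plain Python lists stand in for real tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat values.
Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Return base + delta, parameter by parameter.

    Assumption: the delta is additive. The K2 README does not state
    the convention, so check the K2 GitHub repo for the real procedure.
    """
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise sum recovers the fine-tuned parameter values.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged
```

With real checkpoints the same loop would run over tensors loaded from the base LLaMA checkpoint and the released delta. The key and shape checks guard against pairing the delta with a mismatched base model.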