Commit History
Merge branch 'main' of https://huggingface.co/THUDM/chatglm-6b-int4
6c5205c
duzx16
committed on
Update license
bb09de3
duzx16
committed on
Upload pytorch_model.bin
02a065c
Update slack link
e214c5b
Update decode method in tokenizer
d8a6cfc
duzx16
committed on
Add support for parallel quantization on Mac
f6b88da
duzx16
committed on
Remove assert in load_cpu_kernel
63d66b0
duzx16
committed on
Sync with chatglm-6b
f55a108
duzx16
committed on
Remove pytorch_model.bin.index.json
e02ba89
duzx16
committed on
Update slack link
6498797
duzx16
committed on
Add pytorch_model.bin.index.json
1e40d96
duzx16
committed on
Add assertion when loading cpu and cuda kernel fails
630d0ef
songxxzp
committed on
Add assertion when loading cpu and cuda kernel fails
bcc35f0
songxxzp
committed on
Merge branch 'dev'
fe0674f
songxxzp
committed on
Update CPU kernel loading method
c7d8998
songxxzp
committed on
Fix gmask
3485994
duzx16
committed on
Add empty_init option
9333486
duzx16
committed on
Update README.md
6466cdc
duzx16
committed on
Fix eos token in tokenizer
9163f7e
duzx16
committed on
Update dependency
649466f
duzx16
committed on
Fix attention score on mps
41fda88
duzx16
committed on
Fix logit processor
a7272d4
duzx16
committed on
Merge branch 'slim' of https://huggingface.co/THUDM/chatglm-6b-int4 into slim
96de7a2
duzx16
committed on
Fix embedding quantization
5fc46d2
duzx16
committed on
Upload pytorch_model.bin
7edbdfe
Slim embedding
bfb1a8f
duzx16
committed on
Fix bugs when compiling cpu kernels
68873da
Drop icetk dependency
1f34060
duzx16
committed on
Fix position ids expand
19685a5
duzx16
committed on
Synchronize with chatglm 6b repo
7aaf3fe
Fix parallel cpu kernel
7458231
Fix bugs in quantization when loading kernels
dac03c3
Fix Chinese punctuation
debaf00
duzx16
committed on
Update README.md
3ba9437
Update README.md
0d0e806
Update README.md
7ad727c
init commmit
a93efa9
initial commit
62a9758
Zhengxiao Du
committed on