EC2 Default User
committed on
Commit 5398b80
1 Parent(s): f799402
Update spaCy pipeline
- .gitattributes +1 -0
- LICENSES_SOURCES +0 -551
- README.md +58 -44
- accuracy.json +176 -176
- attribute_ruler/patterns +0 -0
- config.cfg +35 -15
- da_core_news_lg-any-py3-none-any.whl +2 -2
- lemmatizer/cfg +457 -0
- lemmatizer/{lookups/lookups.bin → model} +2 -2
- lemmatizer/trees +0 -0
- meta.json +182 -195
- morphologizer/model +2 -2
- ner/model +2 -2
- parser/model +1 -1
- parser/moves +1 -1
- senter/model +2 -2
- tok2vec/model +2 -2
- tokenizer +0 -0
- vocab/key2row +0 -0
- vocab/strings.json +2 -2
.gitattributes
CHANGED
@@ -19,3 +19,4 @@
|
|
19 |
*strings.json filter=lfs diff=lfs merge=lfs -text
|
20 |
vectors filter=lfs diff=lfs merge=lfs -text
|
21 |
model filter=lfs diff=lfs merge=lfs -text
|
22 |
+
*key2row filter=lfs diff=lfs merge=lfs -text
|
LICENSES_SOURCES
CHANGED
@@ -878,557 +878,6 @@ Creative Commons may be contacted at creativecommons.org.
|
|
878 |
|
879 |
|
880 |
|
881 |
-
# Lemmatization Lists
|
882 |
-
|
883 |
-
* Author: Michal Měchura
|
884 |
-
* URL: https://github.com/michmech/lemmatization-lists/
|
885 |
-
* License: ODbL
|
886 |
-
|
887 |
-
```
|
888 |
-
## ODC Open Database License (ODbL)
|
889 |
-
|
890 |
-
### Preamble
|
891 |
-
|
892 |
-
The Open Database License (ODbL) is a license agreement intended to
|
893 |
-
allow users to freely share, modify, and use this Database while
|
894 |
-
maintaining this same freedom for others. Many databases are covered by
|
895 |
-
copyright, and therefore this document licenses these rights. Some
|
896 |
-
jurisdictions, mainly in the European Union, have specific rights that
|
897 |
-
cover databases, and so the ODbL addresses these rights, too. Finally,
|
898 |
-
the ODbL is also an agreement in contract for users of this Database to
|
899 |
-
act in certain ways in return for accessing this Database.
|
900 |
-
|
901 |
-
Databases can contain a wide variety of types of content (images,
|
902 |
-
audiovisual material, and sounds all in the same database, for example),
|
903 |
-
and so the ODbL only governs the rights over the Database, and not the
|
904 |
-
contents of the Database individually. Licensors should use the ODbL
|
905 |
-
together with another license for the contents, if the contents have a
|
906 |
-
single set of rights that uniformly covers all of the contents. If the
|
907 |
-
contents have multiple sets of different rights, Licensors should
|
908 |
-
describe what rights govern what contents together in the individual
|
909 |
-
record or in some other way that clarifies what rights apply.
|
910 |
-
|
911 |
-
Sometimes the contents of a database, or the database itself, can be
|
912 |
-
covered by other rights not addressed here (such as private contracts,
|
913 |
-
trade mark over the name, or privacy rights / data protection rights
|
914 |
-
over information in the contents), and so you are advised that you may
|
915 |
-
have to consult other documents or clear other rights before doing
|
916 |
-
activities not covered by this License.
|
917 |
-
|
918 |
-
------
|
919 |
-
|
920 |
-
The Licensor (as defined below)
|
921 |
-
|
922 |
-
and
|
923 |
-
|
924 |
-
You (as defined below)
|
925 |
-
|
926 |
-
agree as follows:
|
927 |
-
|
928 |
-
### 1.0 Definitions of Capitalised Words
|
929 |
-
|
930 |
-
"Collective Database" – Means this Database in unmodified form as part
|
931 |
-
of a collection of independent databases in themselves that together are
|
932 |
-
assembled into a collective whole. A work that constitutes a Collective
|
933 |
-
Database will not be considered a Derivative Database.
|
934 |
-
|
935 |
-
"Convey" – As a verb, means Using the Database, a Derivative Database,
|
936 |
-
or the Database as part of a Collective Database in any way that enables
|
937 |
-
a Person to make or receive copies of the Database or a Derivative
|
938 |
-
Database. Conveying does not include interaction with a user through a
|
939 |
-
computer network, or creating and Using a Produced Work, where no
|
940 |
-
transfer of a copy of the Database or a Derivative Database occurs.
|
941 |
-
"Contents" – The contents of this Database, which includes the
|
942 |
-
information, independent works, or other material collected into the
|
943 |
-
Database. For example, the contents of the Database could be factual
|
944 |
-
data or works such as images, audiovisual material, text, or sounds.
|
945 |
-
|
946 |
-
"Database" – A collection of material (the Contents) arranged in a
|
947 |
-
systematic or methodical way and individually accessible by electronic
|
948 |
-
or other means offered under the terms of this License.
|
949 |
-
|
950 |
-
"Database Directive" – Means Directive 96/9/EC of the European
|
951 |
-
Parliament and of the Council of 11 March 1996 on the legal protection
|
952 |
-
of databases, as amended or succeeded.
|
953 |
-
|
954 |
-
"Database Right" – Means rights resulting from the Chapter III ("sui
|
955 |
-
generis") rights in the Database Directive (as amended and as transposed
|
956 |
-
by member states), which includes the Extraction and Re-utilisation of
|
957 |
-
the whole or a Substantial part of the Contents, as well as any similar
|
958 |
-
rights available in the relevant jurisdiction under Section 10.4.
|
959 |
-
|
960 |
-
"Derivative Database" – Means a database based upon the Database, and
|
961 |
-
includes any translation, adaptation, arrangement, modification, or any
|
962 |
-
other alteration of the Database or of a Substantial part of the
|
963 |
-
Contents. This includes, but is not limited to, Extracting or
|
964 |
-
Re-utilising the whole or a Substantial part of the Contents in a new
|
965 |
-
Database.
|
966 |
-
|
967 |
-
"Extraction" – Means the permanent or temporary transfer of all or a
|
968 |
-
Substantial part of the Contents to another medium by any means or in
|
969 |
-
any form.
|
970 |
-
|
971 |
-
"License" – Means this license agreement and is both a license of rights
|
972 |
-
such as copyright and Database Rights and an agreement in contract.
|
973 |
-
|
974 |
-
"Licensor" – Means the Person that offers the Database under the terms
|
975 |
-
of this License.
|
976 |
-
|
977 |
-
"Person" – Means a natural or legal person or a body of persons
|
978 |
-
corporate or incorporate.
|
979 |
-
|
980 |
-
"Produced Work" – a work (such as an image, audiovisual material, text,
|
981 |
-
or sounds) resulting from using the whole or a Substantial part of the
|
982 |
-
Contents (via a search or other query) from this Database, a Derivative
|
983 |
-
Database, or this Database as part of a Collective Database.
|
984 |
-
|
985 |
-
"Publicly" – means to Persons other than You or under Your control by
|
986 |
-
either more than 50% ownership or by the power to direct their
|
987 |
-
activities (such as contracting with an independent consultant).
|
988 |
-
|
989 |
-
"Re-utilisation" – means any form of making available to the public all
|
990 |
-
or a Substantial part of the Contents by the distribution of copies, by
|
991 |
-
renting, by online or other forms of transmission.
|
992 |
-
|
993 |
-
"Substantial" – Means substantial in terms of quantity or quality or a
|
994 |
-
combination of both. The repeated and systematic Extraction or
|
995 |
-
Re-utilisation of insubstantial parts of the Contents may amount to the
|
996 |
-
Extraction or Re-utilisation of a Substantial part of the Contents.
|
997 |
-
|
998 |
-
"Use" – As a verb, means doing any act that is restricted by copyright
|
999 |
-
or Database Rights whether in the original medium or any other; and
|
1000 |
-
includes without limitation distributing, copying, publicly performing,
|
1001 |
-
publicly displaying, and preparing derivative works of the Database, as
|
1002 |
-
well as modifying the Database as may be technically necessary to use it
|
1003 |
-
in a different mode or format.
|
1004 |
-
|
1005 |
-
"You" – Means a Person exercising rights under this License who has not
|
1006 |
-
previously violated the terms of this License with respect to the
|
1007 |
-
Database, or who has received express permission from the Licensor to
|
1008 |
-
exercise rights under this License despite a previous violation.
|
1009 |
-
|
1010 |
-
Words in the singular include the plural and vice versa.
|
1011 |
-
|
1012 |
-
### 2.0 What this License covers
|
1013 |
-
|
1014 |
-
2.1. Legal effect of this document. This License is:
|
1015 |
-
|
1016 |
-
a. A license of applicable copyright and neighbouring rights;
|
1017 |
-
|
1018 |
-
b. A license of the Database Right; and
|
1019 |
-
|
1020 |
-
c. An agreement in contract between You and the Licensor.
|
1021 |
-
|
1022 |
-
2.2 Legal rights covered. This License covers the legal rights in the
|
1023 |
-
Database, including:
|
1024 |
-
|
1025 |
-
a. Copyright. Any copyright or neighbouring rights in the Database.
|
1026 |
-
The copyright licensed includes any individual elements of the
|
1027 |
-
Database, but does not cover the copyright over the Contents
|
1028 |
-
independent of this Database. See Section 2.4 for details. Copyright
|
1029 |
-
law varies between jurisdictions, but is likely to cover: the Database
|
1030 |
-
model or schema, which is the structure, arrangement, and organisation
|
1031 |
-
of the Database, and can also include the Database tables and table
|
1032 |
-
indexes; the data entry and output sheets; and the Field names of
|
1033 |
-
Contents stored in the Database;
|
1034 |
-
|
1035 |
-
b. Database Rights. Database Rights only extend to the Extraction and
|
1036 |
-
Re-utilisation of the whole or a Substantial part of the Contents.
|
1037 |
-
Database Rights can apply even when there is no copyright over the
|
1038 |
-
Database. Database Rights can also apply when the Contents are removed
|
1039 |
-
from the Database and are selected and arranged in a way that would
|
1040 |
-
not infringe any applicable copyright; and
|
1041 |
-
|
1042 |
-
c. Contract. This is an agreement between You and the Licensor for
|
1043 |
-
access to the Database. In return you agree to certain conditions of
|
1044 |
-
use on this access as outlined in this License.
|
1045 |
-
|
1046 |
-
2.3 Rights not covered.
|
1047 |
-
|
1048 |
-
a. This License does not apply to computer programs used in the making
|
1049 |
-
or operation of the Database;
|
1050 |
-
|
1051 |
-
b. This License does not cover any patents over the Contents or the
|
1052 |
-
Database; and
|
1053 |
-
|
1054 |
-
c. This License does not cover any trademarks associated with the
|
1055 |
-
Database.
|
1056 |
-
|
1057 |
-
2.4 Relationship to Contents in the Database. The individual items of
|
1058 |
-
the Contents contained in this Database may be covered by other rights,
|
1059 |
-
including copyright, patent, data protection, privacy, or personality
|
1060 |
-
rights, and this License does not cover any rights (other than Database
|
1061 |
-
Rights or in contract) in individual Contents contained in the Database.
|
1062 |
-
For example, if used on a Database of images (the Contents), this
|
1063 |
-
License would not apply to copyright over individual images, which could
|
1064 |
-
have their own separate licenses, or one single license covering all of
|
1065 |
-
the rights over the images.
|
1066 |
-
|
1067 |
-
### 3.0 Rights granted
|
1068 |
-
|
1069 |
-
3.1 Subject to the terms and conditions of this License, the Licensor
|
1070 |
-
grants to You a worldwide, royalty-free, non-exclusive, terminable (but
|
1071 |
-
only under Section 9) license to Use the Database for the duration of
|
1072 |
-
any applicable copyright and Database Rights. These rights explicitly
|
1073 |
-
include commercial use, and do not exclude any field of endeavour. To
|
1074 |
-
the extent possible in the relevant jurisdiction, these rights may be
|
1075 |
-
exercised in all media and formats whether now known or created in the
|
1076 |
-
future.
|
1077 |
-
|
1078 |
-
The rights granted cover, for example:
|
1079 |
-
|
1080 |
-
a. Extraction and Re-utilisation of the whole or a Substantial part of
|
1081 |
-
the Contents;
|
1082 |
-
|
1083 |
-
b. Creation of Derivative Databases;
|
1084 |
-
|
1085 |
-
c. Creation of Collective Databases;
|
1086 |
-
|
1087 |
-
d. Creation of temporary or permanent reproductions by any means and
|
1088 |
-
in any form, in whole or in part, including of any Derivative
|
1089 |
-
Databases or as a part of Collective Databases; and
|
1090 |
-
|
1091 |
-
e. Distribution, communication, display, lending, making available, or
|
1092 |
-
performance to the public by any means and in any form, in whole or in
|
1093 |
-
part, including of any Derivative Database or as a part of Collective
|
1094 |
-
Databases.
|
1095 |
-
|
1096 |
-
3.2 Compulsory license schemes. For the avoidance of doubt:
|
1097 |
-
|
1098 |
-
a. Non-waivable compulsory license schemes. In those jurisdictions in
|
1099 |
-
which the right to collect royalties through any statutory or
|
1100 |
-
compulsory licensing scheme cannot be waived, the Licensor reserves
|
1101 |
-
the exclusive right to collect such royalties for any exercise by You
|
1102 |
-
of the rights granted under this License;
|
1103 |
-
|
1104 |
-
b. Waivable compulsory license schemes. In those jurisdictions in
|
1105 |
-
which the right to collect royalties through any statutory or
|
1106 |
-
compulsory licensing scheme can be waived, the Licensor waives the
|
1107 |
-
exclusive right to collect such royalties for any exercise by You of
|
1108 |
-
the rights granted under this License; and,
|
1109 |
-
|
1110 |
-
c. Voluntary license schemes. The Licensor waives the right to collect
|
1111 |
-
royalties, whether individually or, in the event that the Licensor is
|
1112 |
-
a member of a collecting society that administers voluntary licensing
|
1113 |
-
schemes, via that society, from any exercise by You of the rights
|
1114 |
-
granted under this License.
|
1115 |
-
|
1116 |
-
3.3 The right to release the Database under different terms, or to stop
|
1117 |
-
distributing or making available the Database, is reserved. Note that
|
1118 |
-
this Database may be multiple-licensed, and so You may have the choice
|
1119 |
-
of using alternative licenses for this Database. Subject to Section
|
1120 |
-
10.4, all other rights not expressly granted by Licensor are reserved.
|
1121 |
-
|
1122 |
-
### 4.0 Conditions of Use
|
1123 |
-
|
1124 |
-
4.1 The rights granted in Section 3 above are expressly made subject to
|
1125 |
-
Your complying with the following conditions of use. These are important
|
1126 |
-
conditions of this License, and if You fail to follow them, You will be
|
1127 |
-
in material breach of its terms.
|
1128 |
-
|
1129 |
-
4.2 Notices. If You Publicly Convey this Database, any Derivative
|
1130 |
-
Database, or the Database as part of a Collective Database, then You
|
1131 |
-
must:
|
1132 |
-
|
1133 |
-
a. Do so only under the terms of this License or another license
|
1134 |
-
permitted under Section 4.4;
|
1135 |
-
|
1136 |
-
b. Include a copy of this License (or, as applicable, a license
|
1137 |
-
permitted under Section 4.4) or its Uniform Resource Identifier (URI)
|
1138 |
-
with the Database or Derivative Database, including both in the
|
1139 |
-
Database or Derivative Database and in any relevant documentation; and
|
1140 |
-
|
1141 |
-
c. Keep intact any copyright or Database Right notices and notices
|
1142 |
-
that refer to this License.
|
1143 |
-
|
1144 |
-
d. If it is not possible to put the required notices in a particular
|
1145 |
-
file due to its structure, then You must include the notices in a
|
1146 |
-
location (such as a relevant directory) where users would be likely to
|
1147 |
-
look for it.
|
1148 |
-
|
1149 |
-
4.3 Notice for using output (Contents). Creating and Using a Produced
|
1150 |
-
Work does not require the notice in Section 4.2. However, if you
|
1151 |
-
Publicly Use a Produced Work, You must include a notice associated with
|
1152 |
-
the Produced Work reasonably calculated to make any Person that uses,
|
1153 |
-
views, accesses, interacts with, or is otherwise exposed to the Produced
|
1154 |
-
Work aware that Content was obtained from the Database, Derivative
|
1155 |
-
Database, or the Database as part of a Collective Database, and that it
|
1156 |
-
is available under this License.
|
1157 |
-
|
1158 |
-
a. Example notice. The following text will satisfy notice under
|
1159 |
-
Section 4.3:
|
1160 |
-
|
1161 |
-
Contains information from DATABASE NAME, which is made available
|
1162 |
-
here under the Open Database License (ODbL).
|
1163 |
-
|
1164 |
-
DATABASE NAME should be replaced with the name of the Database and a
|
1165 |
-
hyperlink to the URI of the Database. "Open Database License" should
|
1166 |
-
contain a hyperlink to the URI of the text of this License. If
|
1167 |
-
hyperlinks are not possible, You should include the plain text of the
|
1168 |
-
required URI's with the above notice.
|
1169 |
-
|
1170 |
-
4.4 Share alike.
|
1171 |
-
|
1172 |
-
a. Any Derivative Database that You Publicly Use must be only under
|
1173 |
-
the terms of:
|
1174 |
-
|
1175 |
-
i. This License;
|
1176 |
-
|
1177 |
-
ii. A later version of this License similar in spirit to this
|
1178 |
-
License; or
|
1179 |
-
|
1180 |
-
iii. A compatible license.
|
1181 |
-
|
1182 |
-
If You license the Derivative Database under one of the licenses
|
1183 |
-
mentioned in (iii), You must comply with the terms of that license.
|
1184 |
-
|
1185 |
-
b. For the avoidance of doubt, Extraction or Re-utilisation of the
|
1186 |
-
whole or a Substantial part of the Contents into a new database is a
|
1187 |
-
Derivative Database and must comply with Section 4.4.
|
1188 |
-
|
1189 |
-
c. Derivative Databases and Produced Works. A Derivative Database is
|
1190 |
-
Publicly Used and so must comply with Section 4.4. if a Produced Work
|
1191 |
-
created from the Derivative Database is Publicly Used.
|
1192 |
-
|
1193 |
-
d. Share Alike and additional Contents. For the avoidance of doubt,
|
1194 |
-
You must not add Contents to Derivative Databases under Section 4.4 a
|
1195 |
-
that are incompatible with the rights granted under this License.
|
1196 |
-
|
1197 |
-
e. Compatible licenses. Licensors may authorise a proxy to determine
|
1198 |
-
compatible licenses under Section 4.4 a iii. If they do so, the
|
1199 |
-
authorised proxy's public statement of acceptance of a compatible
|
1200 |
-
license grants You permission to use the compatible license.
|
1201 |
-
|
1202 |
-
|
1203 |
-
4.5 Limits of Share Alike. The requirements of Section 4.4 do not apply
|
1204 |
-
in the following:
|
1205 |
-
|
1206 |
-
a. For the avoidance of doubt, You are not required to license
|
1207 |
-
Collective Databases under this License if You incorporate this
|
1208 |
-
Database or a Derivative Database in the collection, but this License
|
1209 |
-
still applies to this Database or a Derivative Database as a part of
|
1210 |
-
the Collective Database;
|
1211 |
-
|
1212 |
-
b. Using this Database, a Derivative Database, or this Database as
|
1213 |
-
part of a Collective Database to create a Produced Work does not
|
1214 |
-
create a Derivative Database for purposes of Section 4.4; and
|
1215 |
-
|
1216 |
-
c. Use of a Derivative Database internally within an organisation is
|
1217 |
-
not to the public and therefore does not fall under the requirements
|
1218 |
-
of Section 4.4.
|
1219 |
-
|
1220 |
-
4.6 Access to Derivative Databases. If You Publicly Use a Derivative
|
1221 |
-
Database or a Produced Work from a Derivative Database, You must also
|
1222 |
-
offer to recipients of the Derivative Database or Produced Work a copy
|
1223 |
-
in a machine readable form of:
|
1224 |
-
|
1225 |
-
a. The entire Derivative Database; or
|
1226 |
-
|
1227 |
-
b. A file containing all of the alterations made to the Database or
|
1228 |
-
the method of making the alterations to the Database (such as an
|
1229 |
-
algorithm), including any additional Contents, that make up all the
|
1230 |
-
differences between the Database and the Derivative Database.
|
1231 |
-
|
1232 |
-
The Derivative Database (under a.) or alteration file (under b.) must be
|
1233 |
-
available at no more than a reasonable production cost for physical
|
1234 |
-
distributions and free of charge if distributed over the internet.
|
1235 |
-
|
1236 |
-
4.7 Technological measures and additional terms
|
1237 |
-
|
1238 |
-
a. This License does not allow You to impose (except subject to
|
1239 |
-
Section 4.7 b.) any terms or any technological measures on the
|
1240 |
-
Database, a Derivative Database, or the whole or a Substantial part of
|
1241 |
-
the Contents that alter or restrict the terms of this License, or any
|
1242 |
-
rights granted under it, or have the effect or intent of restricting
|
1243 |
-
the ability of any person to exercise those rights.
|
1244 |
-
|
1245 |
-
b. Parallel distribution. You may impose terms or technological
|
1246 |
-
measures on the Database, a Derivative Database, or the whole or a
|
1247 |
-
Substantial part of the Contents (a "Restricted Database") in
|
1248 |
-
contravention of Section 4.74 a. only if You also make a copy of the
|
1249 |
-
Database or a Derivative Database available to the recipient of the
|
1250 |
-
Restricted Database:
|
1251 |
-
|
1252 |
-
i. That is available without additional fee;
|
1253 |
-
|
1254 |
-
ii. That is available in a medium that does not alter or restrict
|
1255 |
-
the terms of this License, or any rights granted under it, or have
|
1256 |
-
the effect or intent of restricting the ability of any person to
|
1257 |
-
exercise those rights (an "Unrestricted Database"); and
|
1258 |
-
|
1259 |
-
iii. The Unrestricted Database is at least as accessible to the
|
1260 |
-
recipient as a practical matter as the Restricted Database.
|
1261 |
-
|
1262 |
-
c. For the avoidance of doubt, You may place this Database or a
|
1263 |
-
Derivative Database in an authenticated environment, behind a
|
1264 |
-
password, or within a similar access control scheme provided that You
|
1265 |
-
do not alter or restrict the terms of this License or any rights
|
1266 |
-
granted under it or have the effect or intent of restricting the
|
1267 |
-
ability of any person to exercise those rights.
|
1268 |
-
|
1269 |
-
4.8 Licensing of others. You may not sublicense the Database. Each time
|
1270 |
-
You communicate the Database, the whole or Substantial part of the
|
1271 |
-
Contents, or any Derivative Database to anyone else in any way, the
|
1272 |
-
Licensor offers to the recipient a license to the Database on the same
|
1273 |
-
terms and conditions as this License. You are not responsible for
|
1274 |
-
enforcing compliance by third parties with this License, but You may
|
1275 |
-
enforce any rights that You have over a Derivative Database. You are
|
1276 |
-
solely responsible for any modifications of a Derivative Database made
|
1277 |
-
by You or another Person at Your direction. You may not impose any
|
1278 |
-
further restrictions on the exercise of the rights granted or affirmed
|
1279 |
-
under this License.
|
1280 |
-
|
1281 |
-
### 5.0 Moral rights
|
1282 |
-
|
1283 |
-
5.1 Moral rights. This section covers moral rights, including any rights
|
1284 |
-
to be identified as the author of the Database or to object to treatment
|
1285 |
-
that would otherwise prejudice the author's honour and reputation, or
|
1286 |
-
any other derogatory treatment:
|
1287 |
-
|
1288 |
-
a. For jurisdictions allowing waiver of moral rights, Licensor waives
|
1289 |
-
all moral rights that Licensor may have in the Database to the fullest
|
1290 |
-
extent possible by the law of the relevant jurisdiction under Section
|
1291 |
-
10.4;
|
1292 |
-
|
1293 |
-
b. If waiver of moral rights under Section 5.1 a in the relevant
|
1294 |
-
jurisdiction is not possible, Licensor agrees not to assert any moral
|
1295 |
-
rights over the Database and waives all claims in moral rights to the
|
1296 |
-
fullest extent possible by the law of the relevant jurisdiction under
|
1297 |
-
Section 10.4; and
|
1298 |
-
|
1299 |
-
c. For jurisdictions not allowing waiver or an agreement not to assert
|
1300 |
-
moral rights under Section 5.1 a and b, the author may retain their
|
1301 |
-
moral rights over certain aspects of the Database.
|
1302 |
-
|
1303 |
-
Please note that some jurisdictions do not allow for the waiver of moral
|
1304 |
-
rights, and so moral rights may still subsist over the Database in some
|
1305 |
-
jurisdictions.
|
1306 |
-
|
1307 |
-
### 6.0 Fair dealing, Database exceptions, and other rights not affected
|
1308 |
-
|
1309 |
-
6.1 This License does not affect any rights that You or anyone else may
|
1310 |
-
independently have under any applicable law to make any use of this
|
1311 |
-
Database, including without limitation:
|
1312 |
-
|
1313 |
-
a. Exceptions to the Database Right including: Extraction of Contents
|
1314 |
-
from non-electronic Databases for private purposes, Extraction for
|
1315 |
-
purposes of illustration for teaching or scientific research, and
|
1316 |
-
Extraction or Re-utilisation for public security or an administrative
|
1317 |
-
or judicial procedure.
|
1318 |
-
|
1319 |
-
b. Fair dealing, fair use, or any other legally recognised limitation
|
1320 |
-
or exception to infringement of copyright or other applicable laws.
|
1321 |
-
|
1322 |
-
6.2 This License does not affect any rights of lawful users to Extract
|
1323 |
-
and Re-utilise insubstantial parts of the Contents, evaluated
|
1324 |
-
quantitatively or qualitatively, for any purposes whatsoever, including
|
1325 |
-
creating a Derivative Database (subject to other rights over the
|
1326 |
-
Contents, see Section 2.4). The repeated and systematic Extraction or
|
1327 |
-
Re-utilisation of insubstantial parts of the Contents may however amount
|
1328 |
-
to the Extraction or Re-utilisation of a Substantial part of the
|
1329 |
-
Contents.
|
1330 |
-
|
1331 |
-
### 7.0 Warranties and Disclaimer
|
1332 |
-
|
1333 |
-
7.1 The Database is licensed by the Licensor "as is" and without any
|
1334 |
-
warranty of any kind, either express, implied, or arising by statute,
|
1335 |
-
custom, course of dealing, or trade usage. Licensor specifically
|
1336 |
-
disclaims any and all implied warranties or conditions of title,
|
1337 |
-
non-infringement, accuracy or completeness, the presence or absence of
|
1338 |
-
errors, fitness for a particular purpose, merchantability, or otherwise.
|
1339 |
-
Some jurisdictions do not allow the exclusion of implied warranties, so
|
1340 |
-
this exclusion may not apply to You.
|
1341 |
-
|
1342 |
-
### 8.0 Limitation of liability
|
1343 |
-
|
1344 |
-
8.1 Subject to any liability that may not be excluded or limited by law,
|
1345 |
-
the Licensor is not liable for, and expressly excludes, all liability
|
1346 |
-
for loss or damage however and whenever caused to anyone by any use
|
1347 |
-
under this License, whether by You or by anyone else, and whether caused
|
1348 |
-
by any fault on the part of the Licensor or not. This exclusion of
|
1349 |
-
liability includes, but is not limited to, any special, incidental,
|
1350 |
-
consequential, punitive, or exemplary damages such as loss of revenue,
|
1351 |
-
data, anticipated profits, and lost business. This exclusion applies
|
1352 |
-
even if the Licensor has been advised of the possibility of such
|
1353 |
-
damages.
|
1354 |
-
|
1355 |
-
8.2 If liability may not be excluded by law, it is limited to actual and
|
1356 |
-
direct financial loss to the extent it is caused by proved negligence on
|
1357 |
-
the part of the Licensor.
|
1358 |
-
|
1359 |
-
### 9.0 Termination of Your rights under this License
|
1360 |
-
|
1361 |
-
9.1 Any breach by You of the terms and conditions of this License
|
1362 |
-
automatically terminates this License with immediate effect and without
|
1363 |
-
notice to You. For the avoidance of doubt, Persons who have received the
|
1364 |
-
Database, the whole or a Substantial part of the Contents, Derivative
|
1365 |
-
Databases, or the Database as part of a Collective Database from You
|
1366 |
-
under this License will not have their licenses terminated provided
|
1367 |
-
their use is in full compliance with this License or a license granted
|
1368 |
-
under Section 4.8 of this License. Sections 1, 2, 7, 8, 9 and 10 will
|
1369 |
-
survive any termination of this License.
|
1370 |
-
|
1371 |
-
9.2 If You are not in breach of the terms of this License, the Licensor
|
1372 |
-
will not terminate Your rights under it.
|
1373 |
-
|
1374 |
-
9.3 Unless terminated under Section 9.1, this License is granted to You
|
1375 |
-
for the duration of applicable rights in the Database.
|
1376 |
-
|
1377 |
-
9.4 Reinstatement of rights. If you cease any breach of the terms and
|
1378 |
-
conditions of this License, then your full rights under this License
|
1379 |
-
will be reinstated:
|
1380 |
-
|
1381 |
-
a. Provisionally and subject to permanent termination until the 60th
|
1382 |
-
day after cessation of breach;
|
1383 |
-
|
1384 |
-
b. Permanently on the 60th day after cessation of breach unless
|
1385 |
-
otherwise reasonably notified by the Licensor; or
|
1386 |
-
|
1387 |
-
c. Permanently if reasonably notified by the Licensor of the
|
1388 |
-
violation, this is the first time You have received notice of
|
1389 |
-
violation of this License from the Licensor, and You cure the
|
1390 |
-
violation prior to 30 days after your receipt of the notice.
|
1391 |
-
|
1392 |
-
Persons subject to permanent termination of rights are not eligible to
|
1393 |
-
be a recipient and receive a license under Section 4.8.
|
1394 |
-
|
1395 |
-
9.5 Notwithstanding the above, Licensor reserves the right to release
|
1396 |
-
the Database under different license terms or to stop distributing or
|
1397 |
-
making available the Database. Releasing the Database under different
|
1398 |
-
license terms or stopping the distribution of the Database will not
|
1399 |
-
withdraw this License (or any other license that has been, or is
|
1400 |
-
required to be, granted under the terms of this License), and this
|
1401 |
-
License will continue in full force and effect unless terminated as
|
1402 |
-
stated above.
|
1403 |
-
|
1404 |
-
### 10.0 General
|
1405 |
-
|
1406 |
-
10.1 If any provision of this License is held to be invalid or
|
1407 |
-
unenforceable, that must not affect the validity or enforceability of
|
1408 |
-
the remainder of the terms and conditions of this License and each
|
1409 |
-
remaining provision of this License shall be valid and enforced to the
|
1410 |
-
fullest extent permitted by law.
|
1411 |
-
|
1412 |
-
10.2 This License is the entire agreement between the parties with
|
1413 |
-
respect to the rights granted here over the Database. It replaces any
|
1414 |
-
earlier understandings, agreements or representations with respect to
|
1415 |
-
the Database.
|
1416 |
-
|
1417 |
-
10.3 If You are in breach of the terms of this License, You will not be
|
1418 |
-
entitled to rely on the terms of this License or to complain of any
|
1419 |
-
breach by the Licensor.
|
1420 |
-
|
1421 |
-
10.4 Choice of law. This License takes effect in and will be governed by
|
1422 |
-
the laws of the relevant jurisdiction in which the License terms are
|
1423 |
-
sought to be enforced. If the standard suite of rights granted under
|
1424 |
-
applicable copyright law and Database Rights in the relevant
|
1425 |
-
jurisdiction includes additional rights not granted under this License,
|
1426 |
-
these additional rights are granted in this License in order to meet the
|
1427 |
-
terms of this License.```
|
1428 |
-
|
1429 |
-
|
1430 |
-
|
1431 |
-
|
1432 |
# Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)
|
1433 |
|
1434 |
* Author: Explosion
|
|
|
878 |
|
879 |
|
880 |
|
881 |
# Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)
|
882 |
|
883 |
* Author: Explosion
|
README.md
CHANGED
@@ -14,61 +14,76 @@ model-index:
|
|
14 |
metrics:
|
15 |
- name: NER Precision
|
16 |
type: precision
|
17 |
-
value: 0.
|
18 |
- name: NER Recall
|
19 |
type: recall
|
20 |
-
value: 0.
|
21 |
- name: NER F Score
|
22 |
type: f_score
|
23 |
-
value: 0.
|
24 |
- task:
|
25 |
name: POS
|
26 |
type: token-classification
|
27 |
metrics:
|
28 |
-
- name: POS Accuracy
|
29 |
type: accuracy
|
30 |
-
value: 0.
|
31 |
- task:
|
32 |
-
name:
|
33 |
type: token-classification
|
34 |
metrics:
|
35 |
-
- name:
|
36 |
-
type:
|
37 |
-
value: 0.
|
38 |
-
- name: SENTER Recall
|
39 |
-
type: recall
|
40 |
-
value: 0.8882978723
|
41 |
-
- name: SENTER F Score
|
42 |
-
type: f_score
|
43 |
-
value: 0.9010791367
|
44 |
- task:
|
45 |
-
name:
|
46 |
type: token-classification
|
47 |
metrics:
|
48 |
-
- name:
|
49 |
type: accuracy
|
50 |
-
value: 0.
|
51 |
- task:
|
52 |
name: LABELED_DEPENDENCIES
|
53 |
type: token-classification
|
54 |
metrics:
|
55 |
-
- name: Labeled
|
56 |
-
type:
|
57 |
-
value: 0.
|
58 |
---
|
59 |
### Details: https://spacy.io/models/da#da_core_news_lg
|
60 |
|
61 |
-
Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler
|
62 |
|
63 |
| Feature | Description |
|
64 |
| --- | --- |
|
65 |
| **Name** | `da_core_news_lg` |
|
66 |
-
| **Version** | `3.
|
67 |
-
| **spaCy** | `>=3.
|
68 |
-
| **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `
|
69 |
-
| **Components** | `tok2vec`, `morphologizer`, `parser`, `
|
70 |
| **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
|
71 |
-
| **Sources** | [UD Danish DDT v2.8](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[
|
72 |
| **License** | `CC BY-SA 4.0` |
|
73 |
| **Author** | [Explosion](https://explosion.ai) |
|
74 |
|
@@ -76,13 +91,12 @@ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
|
|
76 |
|
77 |
<details>
|
78 |
|
79 |
-
<summary>View label scheme (
|
80 |
|
81 |
| Component | Labels |
|
82 |
| --- | --- |
|
83 |
| **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
|
84 |
| **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `advmod:lmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:lmod`, `obl:tmod`, `punct`, `xcomp` |
|
85 |
-
| **`senter`** | `I`, `S` |
|
86 |
| **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
|
87 |
|
88 |
</details>
|
@@ -95,18 +109,18 @@ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
|
|
95 |
| `TOKEN_P` | 99.78 |
|
96 |
| `TOKEN_R` | 99.75 |
|
97 |
| `TOKEN_F` | 99.76 |
|
98 |
-
| `POS_ACC` | 96.
|
99 |
-
| `MORPH_ACC` | 95.
|
100 |
-
| `MORPH_MICRO_P` | 97.
|
101 |
-
| `MORPH_MICRO_R` | 96.
|
102 |
-
| `MORPH_MICRO_F` | 96.
|
103 |
-
| `SENTS_P` | 91.
|
104 |
-
| `SENTS_R` |
|
105 |
-
| `SENTS_F` | 90.
|
106 |
-
| `DEP_UAS` |
|
107 |
-
| `DEP_LAS` | 78.
|
108 |
-
| `
|
109 |
-
| `
|
110 |
-
| `ENTS_P` |
|
111 |
-
| `ENTS_R` |
|
112 |
-
| `ENTS_F` |
|
|
|
14 |
metrics:
|
15 |
- name: NER Precision
|
16 |
type: precision
|
17 |
+
value: 0.8183716075
|
18 |
- name: NER Recall
|
19 |
type: recall
|
20 |
+
value: 0.8166666667
|
21 |
- name: NER F Score
|
22 |
type: f_score
|
23 |
+
value: 0.8175182482
|
24 |
+
- task:
|
25 |
+
name: TAG
|
26 |
+
type: token-classification
|
27 |
+
metrics:
|
28 |
+
- name: TAG (XPOS) Accuracy
|
29 |
+
type: accuracy
|
30 |
+
value: 0.9633898305
|
31 |
- task:
|
32 |
name: POS
|
33 |
type: token-classification
|
34 |
metrics:
|
35 |
+
- name: POS (UPOS) Accuracy
|
36 |
type: accuracy
|
37 |
+
value: 0.9633898305
|
38 |
- task:
|
39 |
+
name: MORPH
|
40 |
type: token-classification
|
41 |
metrics:
|
42 |
+
- name: Morph (UFeats) Accuracy
|
43 |
+
type: accuracy
|
44 |
+
value: 0.9568038741
|
45 |
- task:
|
46 |
+
name: LEMMA
|
47 |
type: token-classification
|
48 |
metrics:
|
49 |
+
- name: Lemma Accuracy
|
50 |
type: accuracy
|
51 |
+
value: 0.9516707022
|
52 |
+
- task:
|
53 |
+
name: UNLABELED_DEPENDENCIES
|
54 |
+
type: token-classification
|
55 |
+
metrics:
|
56 |
+
- name: Unlabeled Attachment Score (UAS)
|
57 |
+
type: f_score
|
58 |
+
value: 0.8195787003
|
59 |
- task:
|
60 |
name: LABELED_DEPENDENCIES
|
61 |
type: token-classification
|
62 |
metrics:
|
63 |
+
- name: Labeled Attachment Score (LAS)
|
64 |
+
type: f_score
|
65 |
+
value: 0.7807576266
|
66 |
+
- task:
|
67 |
+
name: SENTS
|
68 |
+
type: token-classification
|
69 |
+
metrics:
|
70 |
+
- name: Sentences F-Score
|
71 |
+
type: f_score
|
72 |
+
value: 0.9055258467
|
73 |
---
|
74 |
### Details: https://spacy.io/models/da#da_core_news_lg
|
75 |
|
76 |
+
Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner, attribute_ruler.
|
77 |
|
78 |
| Feature | Description |
|
79 |
| --- | --- |
|
80 |
| **Name** | `da_core_news_lg` |
|
81 |
+
| **Version** | `3.3.0` |
|
82 |
+
| **spaCy** | `>=3.3.0.dev0,<3.4.0` |
|
83 |
+
| **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
|
84 |
+
| **Components** | `tok2vec`, `morphologizer`, `parser`, `lemmatizer`, `senter`, `attribute_ruler`, `ner` |
|
85 |
| **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
|
86 |
+
| **Sources** | [UD Danish DDT v2.8](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
|
87 |
| **License** | `CC BY-SA 4.0` |
|
88 |
| **Author** | [Explosion](https://explosion.ai) |
|
89 |
|
|
|
91 |
|
92 |
<details>
|
93 |
|
94 |
+
<summary>View label scheme (193 labels for 3 components)</summary>
|
95 |
|
96 |
| Component | Labels |
|
97 |
| --- | --- |
|
98 |
| **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
|
99 |
| **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `advmod:lmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:lmod`, `obl:tmod`, `punct`, `xcomp` |
|
|
|
100 |
| **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
|
101 |
|
102 |
</details>
|
|
|
109 |
| `TOKEN_P` | 99.78 |
|
110 |
| `TOKEN_R` | 99.75 |
|
111 |
| `TOKEN_F` | 99.76 |
|
112 |
+
| `POS_ACC` | 96.34 |
|
113 |
+
| `MORPH_ACC` | 95.68 |
|
114 |
+
| `MORPH_MICRO_P` | 97.27 |
|
115 |
+
| `MORPH_MICRO_R` | 96.56 |
|
116 |
+
| `MORPH_MICRO_F` | 96.91 |
|
117 |
+
| `SENTS_P` | 91.04 |
|
118 |
+
| `SENTS_R` | 90.07 |
|
119 |
+
| `SENTS_F` | 90.55 |
|
120 |
+
| `DEP_UAS` | 81.96 |
|
121 |
+
| `DEP_LAS` | 78.08 |
|
122 |
+
| `LEMMA_ACC` | 95.17 |
|
123 |
+
| `TAG_ACC` | 96.34 |
|
124 |
+
| `ENTS_P` | 81.84 |
|
125 |
+
| `ENTS_R` | 81.67 |
|
126 |
+
| `ENTS_F` | 81.75 |
|
accuracy.json
CHANGED
@@ -3,81 +3,81 @@
|
|
3 |
"token_p": 0.9977732598,
|
4 |
"token_r": 0.9974835463,
|
5 |
"token_f": 0.997628382,
|
6 |
-
"pos_acc": 0.
|
7 |
-
"morph_acc": 0.
|
8 |
-
"morph_micro_p": 0.
|
9 |
-
"morph_micro_r": 0.
|
10 |
-
"morph_micro_f": 0.
|
11 |
"morph_per_feat": {
|
12 |
"Mood": {
|
13 |
-
"p": 0.
|
14 |
-
"r": 0.
|
15 |
-
"f": 0.
|
16 |
},
|
17 |
"Tense": {
|
18 |
-
"p": 0.
|
19 |
-
"r": 0.
|
20 |
-
"f": 0.
|
21 |
},
|
22 |
"VerbForm": {
|
23 |
-
"p": 0.
|
24 |
-
"r": 0.
|
25 |
-
"f": 0.
|
26 |
},
|
27 |
"Voice": {
|
28 |
-
"p": 0.
|
29 |
-
"r": 0.
|
30 |
-
"f": 0.
|
31 |
},
|
32 |
"Definite": {
|
33 |
-
"p": 0.
|
34 |
-
"r": 0.
|
35 |
-
"f": 0.
|
36 |
},
|
37 |
"Gender": {
|
38 |
-
"p": 0.
|
39 |
-
"r": 0.
|
40 |
-
"f": 0.
|
41 |
},
|
42 |
"Number": {
|
43 |
-
"p": 0.
|
44 |
-
"r": 0.
|
45 |
-
"f": 0.
|
46 |
},
|
47 |
"AdpType": {
|
48 |
-
"p": 0.
|
49 |
-
"r": 0.
|
50 |
-
"f": 0.
|
51 |
},
|
52 |
"PartType": {
|
53 |
-
"p":
|
54 |
"r": 1.0,
|
55 |
-
"f":
|
56 |
},
|
57 |
"Case": {
|
58 |
-
"p": 0.
|
59 |
-
"r": 0.
|
60 |
-
"f": 0.
|
61 |
},
|
62 |
"Person": {
|
63 |
-
"p": 0.
|
64 |
-
"r": 0.
|
65 |
-
"f": 0.
|
66 |
},
|
67 |
"PronType": {
|
68 |
-
"p": 0.
|
69 |
-
"r": 0.
|
70 |
-
"f": 0.
|
71 |
},
|
72 |
"NumType": {
|
73 |
-
"p": 0.
|
74 |
"r": 0.9602649007,
|
75 |
-
"f": 0.
|
76 |
},
|
77 |
"Degree": {
|
78 |
-
"p": 0.
|
79 |
-
"r": 0.
|
80 |
-
"f": 0.
|
81 |
},
|
82 |
"Reflex": {
|
83 |
"p": 1.0,
|
@@ -85,24 +85,24 @@
|
|
85 |
"f": 1.0
|
86 |
},
|
87 |
"Number[psor]": {
|
88 |
-
"p": 0.
|
89 |
"r": 1.0,
|
90 |
-
"f": 0.
|
91 |
},
|
92 |
"Poss": {
|
93 |
-
"p":
|
94 |
"r": 1.0,
|
95 |
-
"f":
|
96 |
},
|
97 |
"Foreign": {
|
98 |
-
"p": 0.
|
99 |
-
"r": 0.
|
100 |
-
"f": 0.
|
101 |
},
|
102 |
"Abbr": {
|
103 |
-
"p":
|
104 |
-
"r": 0.
|
105 |
-
"f": 0.
|
106 |
},
|
107 |
"Style": {
|
108 |
"p": 1.0,
|
@@ -110,146 +110,151 @@
|
|
110 |
"f": 1.0
|
111 |
},
|
112 |
"Polite": {
|
113 |
-
"p":
|
114 |
-
"r": 0.
|
115 |
-
"f": 0.
|
116 |
}
|
117 |
},
|
118 |
-
"sents_p": 0.
|
119 |
-
"sents_r": 0.
|
120 |
-
"sents_f": 0.
|
121 |
-
"dep_uas": 0.
|
122 |
-
"dep_las": 0.
|
123 |
"dep_las_per_type": {
|
124 |
"advmod": {
|
125 |
-
"p": 0.
|
126 |
-
"r": 0.
|
127 |
-
"f": 0.
|
128 |
},
|
129 |
"root": {
|
130 |
-
"p": 0.
|
131 |
-
"r": 0.
|
132 |
-
"f": 0.
|
133 |
},
|
134 |
"nsubj": {
|
135 |
-
"p": 0.
|
136 |
-
"r": 0.
|
137 |
-
"f": 0.
|
138 |
},
|
139 |
"case": {
|
140 |
-
"p": 0.
|
141 |
-
"r": 0.
|
142 |
-
"f": 0.
|
143 |
},
|
144 |
"obl": {
|
145 |
-
"p": 0.
|
146 |
"r": 0.6739130435,
|
147 |
-
"f": 0.
|
148 |
},
|
149 |
"cc": {
|
150 |
-
"p": 0.
|
151 |
-
"r": 0.
|
152 |
-
"f": 0.
|
153 |
},
|
154 |
"conj": {
|
155 |
-
"p": 0.
|
156 |
-
"r": 0.
|
157 |
-
"f": 0.
|
158 |
},
|
159 |
"obj": {
|
160 |
-
"p": 0.
|
161 |
-
"r": 0.
|
162 |
-
"f": 0.
|
163 |
},
|
164 |
"aux": {
|
165 |
-
"p": 0.
|
166 |
-
"r": 0.
|
167 |
-
"f": 0.
|
168 |
},
|
169 |
"acl:relcl": {
|
170 |
-
"p": 0.
|
171 |
-
"r": 0.
|
172 |
-
"f": 0.
|
173 |
},
|
174 |
"advmod:lmod": {
|
175 |
-
"p": 0.
|
176 |
-
"r": 0.
|
177 |
-
"f": 0.
|
178 |
},
|
179 |
"det": {
|
180 |
-
"p": 0.
|
181 |
-
"r": 0.
|
182 |
-
"f": 0.
|
183 |
},
|
184 |
"amod": {
|
185 |
-
"p": 0.
|
186 |
-
"r": 0.
|
187 |
-
"f": 0.
|
188 |
},
|
189 |
"nmod:poss": {
|
190 |
-
"p": 0.
|
191 |
-
"r": 0.
|
192 |
-
"f": 0.
|
193 |
},
|
194 |
"ccomp": {
|
195 |
-
"p": 0.
|
196 |
-
"r": 0.
|
197 |
-
"f": 0.
|
198 |
},
|
199 |
"nummod": {
|
200 |
-
"p": 0.
|
201 |
-
"r": 0.
|
202 |
-
"f": 0.
|
203 |
},
|
204 |
"flat": {
|
205 |
-
"p": 0.
|
206 |
-
"r": 0.
|
207 |
-
"f": 0.
|
208 |
},
|
209 |
"compound:prt": {
|
210 |
-
"p": 0.
|
211 |
-
"r": 0.
|
212 |
-
"f": 0.
|
213 |
},
|
214 |
"advcl": {
|
215 |
-
"p": 0.
|
216 |
-
"r": 0.
|
217 |
-
"f": 0.
|
218 |
},
|
219 |
"mark": {
|
220 |
-
"p": 0.
|
221 |
-
"r": 0.
|
222 |
-
"f": 0.
|
223 |
},
|
224 |
"cop": {
|
225 |
-
"p": 0.
|
226 |
-
"r": 0.
|
227 |
-
"f": 0.
|
228 |
},
|
229 |
"dep": {
|
230 |
-
"p": 0.
|
231 |
-
"r": 0.
|
232 |
-
"f": 0.
|
233 |
},
|
234 |
"nmod": {
|
235 |
-
"p": 0.
|
236 |
-
"r": 0.
|
237 |
-
"f": 0.
|
238 |
},
|
239 |
"iobj": {
|
240 |
-
"p": 0.
|
241 |
-
"r": 0.
|
242 |
-
"f": 0.
|
243 |
},
|
244 |
"xcomp": {
|
245 |
-
"p": 0.
|
246 |
-
"r": 0.
|
247 |
-
"f": 0.
|
|
248 |
},
|
249 |
"list": {
|
250 |
-
"p": 0.
|
251 |
"r": 0.3333333333,
|
252 |
-
"f": 0.
|
253 |
},
|
254 |
"vocative": {
|
255 |
"p": 0.0,
|
@@ -257,62 +262,57 @@
|
|
257 |
"f": 0.0
|
258 |
},
|
259 |
"fixed": {
|
260 |
-
"p": 0.
|
261 |
-
"r": 0.
|
262 |
-
"f": 0.
|
263 |
},
|
264 |
-
"
|
265 |
-
"p": 0.
|
266 |
-
"r": 0.
|
267 |
-
"f": 0.
|
268 |
},
|
269 |
-
"
|
270 |
-
"p": 0.
|
271 |
-
"r": 0.
|
272 |
-
"f": 0.
|
273 |
},
|
274 |
"obl:tmod": {
|
275 |
-
"p": 0.
|
276 |
-
"r": 0.
|
277 |
-
"f": 0.
|
278 |
},
|
279 |
"discourse": {
|
280 |
"p": 0.0,
|
281 |
"r": 0.0,
|
282 |
"f": 0.0
|
283 |
-
},
|
284 |
-
"obl:lmod": {
|
285 |
-
"p": 0.0,
|
286 |
-
"r": 0.0,
|
287 |
-
"f": 0.0
|
288 |
}
|
289 |
},
|
290 |
-
"
|
291 |
-
"
|
292 |
-
"ents_p": 0.
|
293 |
-
"ents_r": 0.
|
294 |
-
"ents_f": 0.
|
295 |
"ents_per_type": {
|
296 |
"PER": {
|
297 |
-
"p": 0.
|
298 |
-
"r": 0.
|
299 |
-
"f": 0.
|
300 |
},
|
301 |
"ORG": {
|
302 |
-
"p": 0.
|
303 |
-
"r": 0.
|
304 |
-
"f": 0.
|
305 |
},
|
306 |
"MISC": {
|
307 |
-
"p": 0.
|
308 |
-
"r": 0.
|
309 |
-
"f": 0.
|
310 |
},
|
311 |
"LOC": {
|
312 |
-
"p": 0.
|
313 |
-
"r": 0.
|
314 |
-
"f": 0.
|
315 |
}
|
316 |
},
|
317 |
-
"speed":
|
318 |
}
|
|
|
3 |
"token_p": 0.9977732598,
|
4 |
"token_r": 0.9974835463,
|
5 |
"token_f": 0.997628382,
|
6 |
+
"pos_acc": 0.9633898305,
|
7 |
+
"morph_acc": 0.9568038741,
|
8 |
+
"morph_micro_p": 0.9727434528,
|
9 |
+
"morph_micro_r": 0.9655746807,
|
10 |
+
"morph_micro_f": 0.9691458101,
|
11 |
"morph_per_feat": {
|
12 |
"Mood": {
|
13 |
+
"p": 0.9799043062,
|
14 |
+
"r": 0.9761677788,
|
15 |
+
"f": 0.9780324737
|
16 |
},
|
17 |
"Tense": {
|
18 |
+
"p": 0.9772727273,
|
19 |
+
"r": 0.9713855422,
|
20 |
+
"f": 0.9743202417
|
21 |
},
|
22 |
"VerbForm": {
|
23 |
+
"p": 0.9686153846,
|
24 |
+
"r": 0.9632802938,
|
25 |
+
"f": 0.9659404725
|
26 |
},
|
27 |
"Voice": {
|
28 |
+
"p": 0.9798206278,
|
29 |
+
"r": 0.9798206278,
|
30 |
+
"f": 0.9798206278
|
31 |
},
|
32 |
"Definite": {
|
33 |
+
"p": 0.968812475,
|
34 |
+
"r": 0.9573291189,
|
35 |
+
"f": 0.963036566
|
36 |
},
|
37 |
"Gender": {
|
38 |
+
"p": 0.9597720416,
|
39 |
+
"r": 0.9514788966,
|
40 |
+
"f": 0.9556074766
|
41 |
},
|
42 |
"Number": {
|
43 |
+
"p": 0.9683961022,
|
44 |
+
"r": 0.9590505999,
|
45 |
+
"f": 0.9637006945
|
46 |
},
|
47 |
"AdpType": {
|
48 |
+
"p": 0.9982206406,
|
49 |
+
"r": 0.9920424403,
|
50 |
+
"f": 0.9951219512
|
51 |
},
|
52 |
"PartType": {
|
53 |
+
"p": 0.996763754,
|
54 |
"r": 1.0,
|
55 |
+
"f": 0.9983792545
|
56 |
},
|
57 |
"Case": {
|
58 |
+
"p": 0.9806451613,
|
59 |
+
"r": 0.9605055292,
|
60 |
+
"f": 0.9704708699
|
61 |
},
|
62 |
"Person": {
|
63 |
+
"p": 0.9804270463,
|
64 |
+
"r": 0.9786856128,
|
65 |
+
"f": 0.9795555556
|
66 |
},
|
67 |
"PronType": {
|
68 |
+
"p": 0.9835390947,
|
69 |
+
"r": 0.9827302632,
|
70 |
+
"f": 0.9831345125
|
71 |
},
|
72 |
"NumType": {
|
73 |
+
"p": 0.9931506849,
|
74 |
"r": 0.9602649007,
|
75 |
+
"f": 0.9764309764
|
76 |
},
|
77 |
"Degree": {
|
78 |
+
"p": 0.9578313253,
|
79 |
+
"r": 0.9578313253,
|
80 |
+
"f": 0.9578313253
|
81 |
},
|
82 |
"Reflex": {
|
83 |
"p": 1.0,
|
|
|
85 |
"f": 1.0
|
86 |
},
|
87 |
"Number[psor]": {
|
88 |
+
"p": 0.9772727273,
|
89 |
"r": 1.0,
|
90 |
+
"f": 0.9885057471
|
91 |
},
|
92 |
"Poss": {
|
93 |
+
"p": 0.9887640449,
|
94 |
"r": 1.0,
|
95 |
+
"f": 0.9943502825
|
96 |
},
|
97 |
"Foreign": {
|
98 |
+
"p": 0.6,
|
99 |
+
"r": 0.3,
|
100 |
+
"f": 0.4
|
101 |
},
|
102 |
"Abbr": {
|
103 |
+
"p": 0.0,
|
104 |
+
"r": 0.0,
|
105 |
+
"f": 0.0
|
106 |
},
|
107 |
"Style": {
|
108 |
"p": 1.0,
|
|
|
110 |
"f": 1.0
|
111 |
},
|
112 |
"Polite": {
|
113 |
+
"p": 0.75,
|
114 |
+
"r": 0.75,
|
115 |
+
"f": 0.75
|
116 |
}
|
117 |
},
|
118 |
+
"sents_p": 0.9103942652,
|
119 |
+
"sents_r": 0.9007092199,
|
120 |
+
"sents_f": 0.9055258467,
|
121 |
+
"dep_uas": 0.8195787003,
|
122 |
+
"dep_las": 0.7807576266,
|
123 |
"dep_las_per_type": {
|
124 |
"advmod": {
|
125 |
+
"p": 0.6955345061,
|
126 |
+
"r": 0.7259887006,
|
127 |
+
"f": 0.7104353836
|
128 |
},
|
129 |
"root": {
|
130 |
+
"p": 0.824686941,
|
131 |
+
"r": 0.8173758865,
|
132 |
+
"f": 0.821015138
|
133 |
},
|
134 |
"nsubj": {
|
135 |
+
"p": 0.8361884368,
|
136 |
+
"r": 0.8238396624,
|
137 |
+
"f": 0.829968119
|
138 |
},
|
139 |
"case": {
|
140 |
+
"p": 0.9003984064,
|
141 |
+
"r": 0.8915187377,
|
142 |
+
"f": 0.8959365709
|
143 |
},
|
144 |
"obl": {
|
145 |
+
"p": 0.7221297837,
|
146 |
"r": 0.6739130435,
|
147 |
+
"f": 0.697188755
|
148 |
},
|
149 |
"cc": {
|
150 |
+
"p": 0.7630057803,
|
151 |
+
"r": 0.7674418605,
|
152 |
+
"f": 0.7652173913
|
153 |
},
|
154 |
"conj": {
|
155 |
+
"p": 0.6106442577,
|
156 |
+
"r": 0.5813333333,
|
157 |
+
"f": 0.5956284153
|
158 |
},
|
159 |
"obj": {
|
160 |
+
"p": 0.7893772894,
|
161 |
+
"r": 0.8368932039,
|
162 |
+
"f": 0.8124410933
|
163 |
},
|
164 |
"aux": {
|
165 |
+
"p": 0.8764705882,
|
166 |
+
"r": 0.8688046647,
|
167 |
+
"f": 0.8726207906
|
168 |
},
|
169 |
"acl:relcl": {
|
170 |
+
"p": 0.6300578035,
|
171 |
+
"r": 0.5891891892,
|
172 |
+
"f": 0.6089385475
|
173 |
},
|
174 |
"advmod:lmod": {
|
175 |
+
"p": 0.7272727273,
|
176 |
+
"r": 0.7164179104,
|
177 |
+
"f": 0.7218045113
|
178 |
},
|
179 |
"det": {
|
180 |
+
"p": 0.9140495868,
|
181 |
+
"r": 0.9110378913,
|
182 |
+
"f": 0.9125412541
|
183 |
},
|
184 |
"amod": {
|
185 |
+
"p": 0.8080645161,
|
186 |
+
"r": 0.8549488055,
|
187 |
+
"f": 0.8308457711
|
188 |
},
|
189 |
"nmod:poss": {
|
190 |
+
"p": 0.7373737374,
|
191 |
+
"r": 0.7227722772,
|
192 |
+
"f": 0.73
|
193 |
},
|
194 |
"ccomp": {
|
195 |
+
"p": 0.7068965517,
|
196 |
+
"r": 0.6612903226,
|
197 |
+
"f": 0.6833333333
|
198 |
},
|
199 |
"nummod": {
|
200 |
+
"p": 0.8360655738,
|
201 |
+
"r": 0.85,
|
202 |
+
"f": 0.8429752066
|
203 |
},
|
204 |
"flat": {
|
205 |
+
"p": 0.7844311377,
|
206 |
+
"r": 0.8675496689,
|
207 |
+
"f": 0.8238993711
|
208 |
},
|
209 |
"compound:prt": {
|
210 |
+
"p": 0.5,
|
211 |
+
"r": 0.2926829268,
|
212 |
+
"f": 0.3692307692
|
213 |
},
|
214 |
"advcl": {
|
215 |
+
"p": 0.6545454545,
|
216 |
+
"r": 0.6206896552,
|
217 |
+
"f": 0.6371681416
|
218 |
},
|
219 |
"mark": {
|
220 |
+
"p": 0.8781512605,
|
221 |
+
"r": 0.8583162218,
|
222 |
+
"f": 0.8681204569
|
223 |
},
|
224 |
"cop": {
|
225 |
+
"p": 0.8121546961,
|
226 |
+
"r": 0.84,
|
227 |
+
"f": 0.8258426966
|
228 |
},
|
229 |
"dep": {
|
230 |
+
"p": 0.145631068,
|
231 |
+
"r": 0.2830188679,
|
232 |
+
"f": 0.1923076923
|
233 |
},
|
234 |
"nmod": {
|
235 |
+
"p": 0.6549707602,
|
236 |
+
"r": 0.65625,
|
237 |
+
"f": 0.6556097561
|
238 |
},
|
239 |
"iobj": {
|
240 |
+
"p": 0.8125,
|
241 |
+
"r": 0.5909090909,
|
242 |
+
"f": 0.6842105263
|
243 |
},
|
244 |
"xcomp": {
|
245 |
+
"p": 0.4772727273,
|
246 |
+
"r": 0.3559322034,
|
247 |
+
"f": 0.4077669903
|
248 |
+
},
|
249 |
+
"appos": {
|
250 |
+
"p": 0.5384615385,
|
251 |
+
"r": 0.4242424242,
|
252 |
+
"f": 0.4745762712
|
253 |
},
|
254 |
"list": {
|
255 |
+
"p": 0.5,
|
256 |
"r": 0.3333333333,
|
257 |
+
"f": 0.4
|
258 |
},
|
259 |
"vocative": {
|
260 |
"p": 0.0,
|
|
|
262 |
"f": 0.0
|
263 |
},
|
264 |
"fixed": {
|
265 |
+
"p": 0.8717948718,
|
266 |
+
"r": 0.8292682927,
|
267 |
+
"f": 0.85
|
268 |
},
|
269 |
+
"obl:lmod": {
|
270 |
+
"p": 0.0,
|
271 |
+
"r": 0.0,
|
272 |
+
"f": 0.0
|
273 |
},
|
274 |
+
"expl": {
|
275 |
+
"p": 0.8529411765,
|
276 |
+
"r": 0.8529411765,
|
277 |
+
"f": 0.8529411765
|
278 |
},
|
279 |
"obl:tmod": {
|
280 |
+
"p": 0.6363636364,
|
281 |
+
"r": 0.3888888889,
|
282 |
+
"f": 0.4827586207
|
283 |
},
|
284 |
"discourse": {
|
285 |
"p": 0.0,
|
286 |
"r": 0.0,
|
287 |
"f": 0.0
|
288 |
}
|
289 |
},
|
290 |
+
"lemma_acc": 0.9516707022,
|
291 |
+
"tag_acc": 0.9633898305,
|
292 |
+
"ents_p": 0.8183716075,
|
293 |
+
"ents_r": 0.8166666667,
|
294 |
+
"ents_f": 0.8175182482,
|
295 |
"ents_per_type": {
|
296 |
"PER": {
|
297 |
+
"p": 0.8993710692,
|
298 |
+
"r": 0.8614457831,
|
299 |
+
"f": 0.88
|
300 |
},
|
301 |
"ORG": {
|
302 |
+
"p": 0.7303370787,
|
303 |
+
"r": 0.7222222222,
|
304 |
+
"f": 0.7262569832
|
305 |
},
|
306 |
"MISC": {
|
307 |
+
"p": 0.7288135593,
|
308 |
+
"r": 0.7610619469,
|
309 |
+
"f": 0.7445887446
|
310 |
},
|
311 |
"LOC": {
|
312 |
+
"p": 0.8672566372,
|
313 |
+
"r": 0.8828828829,
|
314 |
+
"f": 0.875
|
315 |
}
|
316 |
},
|
317 |
+
"speed": 10791.2692595094
|
318 |
}
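The per-label `p`, `r` and `f` values in the updated accuracy.json can be sanity-checked against each other. The sketch below assumes `f` is the usual harmonic mean of precision and recall; the helper name `f_score` is hypothetical, and the triples are copied from the new side of the diff above.

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (p, r, f) triples copied from the new side of accuracy.json
reported = {
    "morph_per_feat/Foreign": (0.6, 0.3, 0.4),
    "dep_las_per_type/compound:prt": (0.5, 0.2926829268, 0.3692307692),
    "ents (overall)": (0.8183716075, 0.8166666667, 0.8175182482),
    "ents_per_type/PER": (0.8993710692, 0.8614457831, 0.88),
}

for name, (p, r, f) in reported.items():
    # The file rounds to 10 decimals, so compare with a small tolerance.
    assert abs(f_score(p, r) - f) < 1e-6, name
```

Labels reported with `p = r = 0.0` (for example `vocative` and `discourse`) are covered by the zero guard.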
|
attribute_ruler/patterns
CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
|
|
config.cfg
CHANGED
@@ -10,7 +10,7 @@ seed = 0
10 |
11 |   [nlp]
12 |   lang = "da"
13 | - pipeline = ["tok2vec","morphologizer","parser","
14 |   disabled = ["senter"]
15 |   before_creation = null
16 |   after_creation = null
@@ -26,11 +26,22 @@ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26 |   validate = false
27 |
28 |   [components.lemmatizer]
29 | - factory = "
30 | -
31 | -
32 |   overwrite = false
33 |   scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34 |
35 |   [components.morphologizer]
36 |   factory = "morphologizer"
@@ -39,8 +50,9 @@ overwrite = true
39 |   scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40 |
41 |   [components.morphologizer.model]
42 | - @architectures = "spacy.Tagger.
43 |   nO = null
44 |
45 |   [components.morphologizer.model.tok2vec]
46 |   @architectures = "spacy.Tok2VecListener.v1"
@@ -70,7 +82,7 @@ nO = null
70 |   @architectures = "spacy.MultiHashEmbed.v2"
71 |   width = 96
72 |   attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73 | - rows = [5000,
74 |   include_static_vectors = true
75 |
76 |   [components.ner.model.tok2vec.encode]
@@ -108,8 +120,9 @@ overwrite = false
108 |   scorer = {"@scorers":"spacy.senter_scorer.v1"}
109 |
110 |   [components.senter.model]
111 | - @architectures = "spacy.Tagger.
112 |   nO = null
113 |
114 |   [components.senter.model.tok2vec]
115 |   @architectures = "spacy.Tok2Vec.v2"
@@ -138,7 +151,7 @@ factory = "tok2vec"
138 |   @architectures = "spacy.MultiHashEmbed.v2"
139 |   width = ${components.tok2vec.model.encode:width}
140 |   attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141 | - rows = [5000,
142 |   include_static_vectors = true
143 |
144 |   [components.tok2vec.model.encode]
@@ -175,7 +188,7 @@ dropout = 0.1
175 |   accumulate_gradient = 1
176 |   patience = 5000
177 |   max_epochs = 0
178 | - max_steps =
179 |   eval_frequency = 1000
180 |   frozen_components = []
181 |   before_to_disk = null
@@ -210,17 +223,17 @@ eps = 0.00000001
210 |   learn_rate = 0.001
211 |
212 |   [training.score_weights]
213 | - pos_acc = 0.
214 | - morph_acc = 0.
215 |   morph_per_feat = null
216 |   dep_uas = 0.0
217 | - dep_las = 0.
218 |   dep_las_per_type = null
219 |   sents_p = null
220 |   sents_r = null
221 | - sents_f = 0.
222 | - lemma_acc = 0.
223 | - ents_f = 0.
224 |   ents_p = 0.0
225 |   ents_r = 0.0
226 |   ents_per_type = null
@@ -237,6 +250,13 @@ after_init = null
237 |
238 |   [initialize.components]
239 |
240 |   [initialize.components.morphologizer]
241 |
242 |   [initialize.components.morphologizer.labels]
|
10 |
11 |   [nlp]
12 |   lang = "da"
13 | + pipeline = ["tok2vec","morphologizer","parser","lemmatizer","senter","attribute_ruler","ner"]
14 |   disabled = ["senter"]
15 |   before_creation = null
16 |   after_creation = null
|
26 |   validate = false
27 |
28 |   [components.lemmatizer]
29 | + factory = "trainable_lemmatizer"
30 | + backoff = "orth"
31 | + min_tree_freq = 3
32 |   overwrite = false
33 |   scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34 | + top_k = 1
35 | +
36 | + [components.lemmatizer.model]
37 | + @architectures = "spacy.Tagger.v2"
38 | + nO = null
39 | + normalize = false
40 | +
41 | + [components.lemmatizer.model.tok2vec]
42 | + @architectures = "spacy.Tok2VecListener.v1"
43 | + width = ${components.tok2vec.model.encode:width}
44 | + upstream = "tok2vec"
45 |
46 |   [components.morphologizer]
47 |   factory = "morphologizer"
|
50 |   scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
51 |
52 |   [components.morphologizer.model]
53 | + @architectures = "spacy.Tagger.v2"
54 |   nO = null
55 | + normalize = false
56 |
57 |   [components.morphologizer.model.tok2vec]
58 |   @architectures = "spacy.Tok2VecListener.v1"
|
82 |   @architectures = "spacy.MultiHashEmbed.v2"
83 |   width = 96
84 |   attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
85 | + rows = [5000,1000,2500,2500,50]
86 |   include_static_vectors = true
87 |
88 |   [components.ner.model.tok2vec.encode]
|
120 |   scorer = {"@scorers":"spacy.senter_scorer.v1"}
121 |
122 |   [components.senter.model]
123 | + @architectures = "spacy.Tagger.v2"
124 |   nO = null
125 | + normalize = false
126 |
127 |   [components.senter.model.tok2vec]
128 |   @architectures = "spacy.Tok2Vec.v2"
|
151 |   @architectures = "spacy.MultiHashEmbed.v2"
152 |   width = ${components.tok2vec.model.encode:width}
153 |   attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
154 | + rows = [5000,1000,2500,2500,50]
155 |   include_static_vectors = true
156 |
157 |   [components.tok2vec.model.encode]
|
188 |   accumulate_gradient = 1
189 |   patience = 5000
190 |   max_epochs = 0
191 | + max_steps = 100000
192 |   eval_frequency = 1000
193 |   frozen_components = []
194 |   before_to_disk = null
|
223 |   learn_rate = 0.001
224 |
225 |   [training.score_weights]
226 | + pos_acc = 0.14
227 | + morph_acc = 0.14
228 |   morph_per_feat = null
229 |   dep_uas = 0.0
230 | + dep_las = 0.29
231 |   dep_las_per_type = null
232 |   sents_p = null
233 |   sents_r = null
234 | + sents_f = 0.04
235 | + lemma_acc = 0.1
236 | + ents_f = 0.29
237 |   ents_p = 0.0
238 |   ents_r = 0.0
239 |   ents_per_type = null
|
250 |
251 |   [initialize.components]
252 |
253 | + [initialize.components.lemmatizer]
254 | +
255 | + [initialize.components.lemmatizer.labels]
256 | + @readers = "spacy.read_labels.v1"
257 | + path = "corpus/labels/trainable_lemmatizer.json"
258 | + require = false
259 | +
260 |   [initialize.components.morphologizer]
261 |
262 |   [initialize.components.morphologizer.labels]
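The new `[training.score_weights]` block gives non-zero weight to six metrics. As a rough illustration, the sketch below checks that those weights sum to 1 and combines them with the matching values from the updated accuracy.json. The helper name `weighted_score` is hypothetical, and treating the final training score as a plain weighted sum is an assumption about how the weights are used, not something this diff states.

```python
# Non-zero weights from the new [training.score_weights] section.
# Entries set to 0.0 or null in the config are left out.
score_weights = {
    "pos_acc": 0.14,
    "morph_acc": 0.14,
    "dep_las": 0.29,
    "sents_f": 0.04,
    "lemma_acc": 0.1,
    "ents_f": 0.29,
}

# Matching values copied from the new side of accuracy.json.
scores = {
    "pos_acc": 0.9633898305,
    "morph_acc": 0.9568038741,
    "dep_las": 0.7807576266,
    "sents_f": 0.9055258467,
    "lemma_acc": 0.9516707022,
    "ents_f": 0.8175182482,
}


def weighted_score(weights: dict, values: dict) -> float:
    """Weighted sum of the metrics that carry a weight (assumed combination rule)."""
    return sum(weight * values[name] for name, weight in weights.items())


total_weight = sum(score_weights.values())
assert abs(total_weight - 1.0) < 1e-9  # the six weights are normalized

print(f"combined score: {weighted_score(score_weights, scores):.4f}")
```

Because the weights are normalized, the combined value stays on the same 0 to 1 scale as the individual metrics.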
|
da_core_news_lg-any-py3-none-any.whl
CHANGED
@@ -1,3 +1,3 @@
1 |   version https://git-lfs.github.com/spec/v1
2 | - oid sha256:
3 | - size
|
1 |   version https://git-lfs.github.com/spec/v1
2 | + oid sha256:70b519f9d735120b8eee33370806c00bdb12214887bf6793783421fbbbff1dc3
3 | + size 567085252
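The wheel is tracked with Git LFS, so the diff only shows the three-line pointer file (`version`, `oid`, `size`) and not the binary itself. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, and the pointer text is the new side of the hunk above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")  # e.g. "sha256:<hex digest>"
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:70b519f9d735120b8eee33370806c00bdb12214887bf6793783421fbbbff1dc3
size 567085252
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The `size` field is the byte size of the real wheel, roughly 567 MB here.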
|
lemmatizer/cfg
ADDED
@@ -0,0 +1,457 @@
1 | + {
2 | + "labels":[
3 | + 1,
4 | + 2,
5 | + 4,
6 | + 6,
7 | + 8,
8 | + 10,
9 | + 12,
10 | + 14,
11 | + 16,
12 | + 18,
13 | + 20,
14 | + 24,
15 | + 28,
16 | + 30,
17 | + 32,
18 | + 34,
19 | + 36,
20 | + 39,
21 | + 41,
22 | + 42,
23 | + 43,
24 | + 45,
25 | + 47,
26 | + 49,
27 | + 51,
28 | + 53,
29 | + 55,
30 | + 57,
31 | + 61,
32 | + 65,
33 | + 67,
34 | + 71,
35 | + 73,
36 | + 75,
37 | + 77,
38 | + 79,
39 | + 81,
40 | + 83,
41 | + 85,
42 | + 87,
43 | + 89,
44 | + 91,
45 | + 93,
46 | + 95,
47 | + 99,
48 | + 101,
49 | + 102,
50 | + 104,
51 | + 107,
52 | + 111,
53 | + 113,
54 | + 116,
55 | + 118,
56 | + 121,
57 | + 124,
58 | + 127,
59 | + 128,
60 | + 131,
61 | + 133,
62 | + 134,
63 | + 136,
64 | + 138,
65 | + 140,
66 | + 142,
67 | + 144,
68 | + 145,
69 | + 147,
70 | + 148,
71 | + 149,
72 | + 153,
73 | + 155,
74 | + 158,
75 | + 161,
76 | + 164,
77 | + 166,
78 | + 168,
79 | + 170,
80 | + 172,
81 | + 174,
82 | + 175,
83 | + 177,
84 | + 179,
85 | + 182,
86 | + 184,
87 | + 186,
88 | + 188,
89 | + 190,
90 | + 192,
91 | + 194,
92 | + 196,
93 | + 199,
94 | + 201,
95 | + 203,
96 | + 204,
97 | + 207,
98 | + 208,
99 | + 209,
100 | + 211,
101 | + 213,
102 | + 214,
103 | + 216,
104 | + 218,
105 | + 220,
106 | + 222,
107 | + 224,
108 | + 226,
109 | + 229,
110 | + 231,
111 | + 232,
112 | + 233,
113 | + 235,
114 | + 236,
115 | + 238,
116 | + 239,
117 | + 243,
118 | + 249,
119 | + 253,
120 | + 255,
121 | + 257,
122 | + 259,
123 | + 261,
124 | + 262,
125 | + 263,
126 | + 264,
127 | + 267,
128 | + 269,
129 | + 270,
130 | + 272,
131 | + 274,
132 | + 276,
133 | + 278,
134 | + 280,
135 | + 282,
136 | + 284,
137 | + 286,
138 | + 290,
139 | + 291,
140 | + 293,
141 | + 295,
142 | + 297,
143 | + 299,
144 | + 300,
145 | + 302,
146 | + 303,
147 | + 304,
148 | + 306,
149 | + 308,
150 | + 311,
151 | + 314,
152 | + 315,
153 | + 317,
154 | + 320,
155 | + 321,
156 | + 323,
157 | + 324,
158 | + 326,
159 | + 327,
160 | + 328,
161 | + 330,
162 | + 331,
163 | + 333,
164 | + 337,
165 | + 339,
166 | + 340,
167 | + 344,
168 | + 346,
169 | + 350,
170 | + 353,
171 | + 354,
172 | + 355,
173 | + 358,
174 | + 360,
175 | + 361,
176 | + 363,
177 | + 365,
178 | + 366,
179 | + 369,
180 | + 372,
181 | + 373,
182 | + 376,
183 | + 380,
184 | + 382,
185 | + 383,
186 | + 384,
187 | + 386,
188 | + 387,
189 | + 389,
190 | + 391,
191 | + 392,
192 | + 394,
193 | + 395,
194 | + 398,
195 | + 400,
196 | + 402,
197 | + 404,
198 | + 406,
199 | + 409,
200 | + 411,
201 | + 412,
202 | + 413,
203 | + 415,
204 | + 417,
205 | + 420,
206 | + 421,
207 | + 423,
208 | + 424,
209 | + 425,
210 | + 427,
211 | + 429,
212 | + 431,
213 | + 433,
214 | + 434,
215 | + 436,
216 | + 437,
217 | + 439,
218 | + 440,
219 | + 442,
220 | + 444,
221 | + 445,
222 | + 449,
223 | + 450,
224 | + 452,
225 | + 454,
226 | + 457,
227 | + 459,
228 | + 462,
229 | + 465,
230 | + 466,
231 | + 468,
232 | + 470,
233 | + 471,
234 | + 474,
235 | + 475,
236 | + 478,
237 | + 480,
238 | + 483,
239 | + 485,
240 | + 486,
241 | + 487,
242 | + 489,
243 | + 491,
244 | + 492,
245 | + 493,
246 | + 495,
247 | + 496,
248 | + 498,
249 | + 500,
250 | + 501,
251 | + 502,
252 | + 503,
253 | + 504,
254 | + 505,
255 | + 507,
256 | + 508,
257 | + 509,
258 | + 510,
259 | + 511,
260 | + 512,
261 | + 514,
262 | + 515,
263 | + 516,
264 | + 518,
265 | + 519,
266 | + 520,
267 | + 521,
268 | + 523,
269 | + 525,
270 | + 526,
271 | + 528,
272 | + 531,
273 | + 533,
274 | + 535,
275 | + 453,
276 | + 536,
277 | + 538,
278 | + 539,
279 | + 541,
280 | + 545,
281 | + 547,
282 | + 548,
283 | + 549,
284 | + 550,
285 | + 551,
286 | + 553,
287 | + 554,
288 | + 555,
289 | + 557,
290 | + 559,
291 | + 560,
292 | + 561,
293 | + 563,
294 | + 565,
295 | + 566,
296 | + 567,
297 | + 568,
298 | + 570,
299 | + 571,
300 | + 575,
301 | + 577,
302 | + 578,
303 | + 579,
304 | + 582,
305 | + 585,
306 | + 587,
307 | + 589,
308 | + 593,
309 | + 594,
310 | + 596,
311 | + 597,
312 | + 601,
313 | + 603,
314 | + 605,
315 | + 609,
316 | + 611,
317 | + 612,
318 | + 613,
319 | + 614,
320 | + 615,
321 | + 616,
322 | + 617,
323 | + 619,
324 | + 621,
325 | + 622,
326 | + 624,
327 | + 625,
328 | + 627,
329 | + 628,
330 | + 629,
331 | + 632,
332 | + 634,
333 | + 638,
334 | + 639,
335 | + 640,
336 | + 642,
337 | + 644,
338 | + 647,
339 | + 649,
340 | + 650,
341 | + 651,
342 | + 653,
343 | + 654,
344 | + 655,
345 | + 657,
346 | + 658,
347 | + 659,
348 | + 661,
349 | + 663,
350 | + 665,
351 | + 667,
352 | + 669,
353 | + 670,
354 | + 672,
355 | + 674,
356 | + 676,
357 | + 677,
358 | + 678,
359 | + 680,
360 | + 682,
361 | + 683,
362 | + 685,
363 | + 686,
364 | + 688,
365 | + 689,
366 | + 690,
367 | + 691,
368 | + 694,
369 | + 695,
370 | + 696,
371 | + 697,
372 | + 699,
373 | + 700,
374 | + 701,
375 | + 703,
376 | + 705,
377 | + 706,
378 | + 707,
379 | + 708,
380 | + 712,
381 | + 715,
382 | + 716,
383 | + 718,
384 | + 720,
385 | + 724,
386 | + 726,
387 | + 729,
388 | + 730,
389 | + 732,
390 | + 733,
391 | + 734,
392 | + 736,
393 | + 738,
394 | + 739,
395 | + 740,
396 | + 741,
397 | + 742,
398 | + 743,
399 | + 744,
400 | + 747,
401 | + 749,
402 | + 753,
403 | + 756,
404 | + 758,
405 | + 759,
406 | + 761,
407 | + 762,
408 | + 763,
409 | + 764,
410 | + 766,
411 | + 768,
412 | + 769,
413 | + 771,
414 | + 773,
415 | + 774,
416 | + 775,
417 | + 776,
418 | + 777,
419 | + 781,
420 | + 783,
421 | + 784,
422 | + 785,
423 | + 788,
424 | + 791,
425 | + 792,
426 | + 794,
427 | + 796,
428 | + 797,
429 | + 798,
430 | + 799,
431 | + 800,
432 | + 802,
433 | + 803,
434 | + 804,
435 | + 805,
436 | + 806,
437 | + 808,
438 | + 809,
439 | + 810,
440 | + 811,
441 | + 812,
442 | + 814,
443 | + 815,
444 | + 817,
445 | + 819,
446 | + 820,
447 | + 822,
448 | + 824,
449 | + 825,
450 | + 827,
451 | + 829,
452 | + 831,
453 | + 833,
454 | + 835,
455 | + 837
456 | + ]
457 | + }
|
lemmatizer/{lookups/lookups.bin → model}
RENAMED
@@ -1,3 +1,3 @@
1 |   version https://git-lfs.github.com/spec/v1
2 | - oid sha256:
3 | - size
|
1 |   version https://git-lfs.github.com/spec/v1
2 | + oid sha256:fd1961075ded3bc09a5a58ead51adad20e36d70d2a099362fe21386796b1521e
3 | + size 176206
|
lemmatizer/trees
ADDED
Binary file (89.9 kB). View file
|
|
meta.json
CHANGED
@@ -1,14 +1,14 @@
|
|
1 |
{
|
2 |
"lang":"da",
|
3 |
"name":"core_news_lg",
|
4 |
-
"version":"3.
|
5 |
-
"description":"Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler
|
6 |
"author":"Explosion",
|
7 |
"email":"[email protected]",
|
8 |
"url":"https://explosion.ai",
|
9 |
"license":"CC BY-SA 4.0",
|
10 |
-
"spacy_version":">=3.
|
11 |
-
"spacy_git_version":"
|
12 |
"vectors":{
|
13 |
"width":300,
|
14 |
"vectors":500000,
|
@@ -212,15 +212,8 @@
|
|
212 |
"punct",
|
213 |
"xcomp"
|
214 |
],
|
215 |
-
"senter":[
|
216 |
-
"I",
|
217 |
-
"S"
|
218 |
-
],
|
219 |
"attribute_ruler":[
|
220 |
|
221 |
-
],
|
222 |
-
"lemmatizer":[
|
223 |
-
|
224 |
],
|
225 |
"ner":[
|
226 |
"LOC",
|
@@ -233,17 +226,17 @@
|
|
233 |
"tok2vec",
|
234 |
"morphologizer",
|
235 |
"parser",
|
236 |
-
"attribute_ruler",
|
237 |
"lemmatizer",
|
|
|
238 |
"ner"
|
239 |
],
|
240 |
"components":[
|
241 |
"tok2vec",
|
242 |
"morphologizer",
|
243 |
"parser",
|
|
|
244 |
"senter",
|
245 |
"attribute_ruler",
|
246 |
-
"lemmatizer",
|
247 |
"ner"
|
248 |
],
|
249 |
"disabled":[
|
@@ -254,81 +247,81 @@
|
|
254 |
"token_p":0.9977732598,
|
255 |
"token_r":0.9974835463,
|
256 |
"token_f":0.997628382,
|
257 |
-
"pos_acc":0.
|
258 |
-
"morph_acc":0.
|
259 |
-
"morph_micro_p":0.
|
260 |
-
"morph_micro_r":0.
|
261 |
-
"morph_micro_f":0.
|
262 |
"morph_per_feat":{
|
263 |
"Mood":{
|
264 |
-
"p":0.
|
265 |
-
"r":0.
|
266 |
-
"f":0.
|
267 |
},
|
268 |
"Tense":{
|
269 |
-
"p":0.
|
270 |
-
"r":0.
|
271 |
-
"f":0.
|
272 |
},
|
273 |
"VerbForm":{
|
274 |
-
"p":0.
|
275 |
-
"r":0.
|
276 |
-
"f":0.
|
277 |
},
|
278 |
"Voice":{
|
279 |
-
"p":0.
|
280 |
-
"r":0.
|
281 |
-
"f":0.
|
282 |
},
|
283 |
"Definite":{
|
284 |
-
"p":0.
|
285 |
-
"r":0.
|
286 |
-
"f":0.
|
287 |
},
|
288 |
"Gender":{
|
289 |
-
"p":0.
|
290 |
-
"r":0.
|
291 |
-
"f":0.
|
292 |
},
|
293 |
"Number":{
|
294 |
-
"p":0.
|
295 |
-
"r":0.
|
296 |
-
"f":0.
|
297 |
},
|
298 |
"AdpType":{
|
299 |
-
"p":0.
|
300 |
-
"r":0.
|
301 |
-
"f":0.
|
302 |
},
|
303 |
"PartType":{
|
304 |
-
"p":
|
305 |
"r":1.0,
|
306 |
-
"f":
|
307 |
},
|
308 |
"Case":{
|
309 |
-
"p":0.
|
310 |
-
"r":0.
|
311 |
-
"f":0.
|
312 |
},
|
313 |
"Person":{
|
314 |
-
"p":0.
|
315 |
-
"r":0.
|
316 |
-
"f":0.
|
317 |
},
|
318 |
"PronType":{
|
319 |
-
"p":0.
|
320 |
-
"r":0.
|
321 |
-
"f":0.
|
322 |
},
|
323 |
"NumType":{
|
324 |
-
"p":0.
|
325 |
"r":0.9602649007,
|
326 |
-
"f":0.
|
327 |
},
|
328 |
"Degree":{
|
329 |
-
"p":0.
|
330 |
-
"r":0.
|
331 |
-
"f":0.
|
332 |
},
|
333 |
"Reflex":{
|
334 |
"p":1.0,
|
@@ -336,24 +329,24 @@
|
|
336 |
"f":1.0
|
337 |
},
|
338 |
"Number[psor]":{
|
339 |
-
"p":0.
|
340 |
"r":1.0,
|
341 |
-
"f":0.
|
342 |
},
|
343 |
"Poss":{
|
344 |
-
"p":
|
345 |
"r":1.0,
|
346 |
-
"f":
|
347 |
},
|
348 |
"Foreign":{
|
349 |
-
"p":0.
|
350 |
-
"r":0.
|
351 |
-
"f":0.
|
352 |
},
|
353 |
"Abbr":{
|
354 |
-
"p":
|
355 |
-
"r":0.
|
356 |
-
"f":0.
|
357 |
},
|
358 |
"Style":{
|
359 |
"p":1.0,
|
@@ -361,146 +354,151 @@
|
|
361 |
"f":1.0
|
362 |
},
|
363 |
"Polite":{
|
364 |
-
"p":
|
365 |
-
"r":0.
|
366 |
-
"f":0.
|
367 |
}
|
368 |
},
|
369 |
-
"sents_p":0.
|
370 |
-
"sents_r":0.
|
371 |
-
"sents_f":0.
|
372 |
-
"dep_uas":0.
|
373 |
-
"dep_las":0.
|
374 |
"dep_las_per_type":{
|
375 |
"advmod":{
|
376 |
-
"p":0.
|
377 |
-
"r":0.
|
378 |
-
"f":0.
|
379 |
},
|
380 |
"root":{
|
381 |
-
"p":0.
|
382 |
-
"r":0.
|
383 |
-
"f":0.
|
384 |
},
|
385 |
"nsubj":{
|
386 |
-
"p":0.
|
387 |
-
"r":0.
|
388 |
-
"f":0.
|
389 |
},
|
390 |
"case":{
|
391 |
-
"p":0.
|
392 |
-
"r":0.
|
393 |
-
"f":0.
|
394 |
},
|
395 |
"obl":{
|
396 |
-
"p":0.
|
397 |
"r":0.6739130435,
|
398 |
-
"f":0.
|
399 |
},
|
400 |
"cc":{
|
401 |
-
"p":0.
|
402 |
-
"r":0.
|
403 |
-
"f":0.
|
404 |
},
|
405 |
"conj":{
|
406 |
-
"p":0.
|
407 |
-
"r":0.
|
408 |
-
"f":0.
|
409 |
},
|
410 |
"obj":{
|
411 |
-
"p":0.
|
412 |
-
"r":0.
|
413 |
-
"f":0.
|
414 |
},
|
415 |
"aux":{
|
416 |
-
"p":0.
|
417 |
-
"r":0.
|
418 |
-
"f":0.
|
419 |
},
|
420 |
"acl:relcl":{
|
421 |
-
"p":0.
|
422 |
-
"r":0.
|
423 |
-
"f":0.
|
424 |
},
|
425 |
"advmod:lmod":{
|
426 |
-
"p":0.
|
427 |
-
"r":0.
|
428 |
-
"f":0.
|
429 |
},
|
430 |
"det":{
|
431 |
-
"p":0.
|
432 |
-
"r":0.
|
433 |
-
"f":0.
|
434 |
},
|
435 |
"amod":{
|
436 |
-
"p":0.
|
437 |
-
"r":0.
|
438 |
-
"f":0.
|
439 |
},
|
440 |
"nmod:poss":{
|
441 |
-
"p":0.
|
442 |
-
"r":0.
|
443 |
-
"f":0.
|
444 |
},
|
445 |
"ccomp":{
|
446 |
-
"p":0.
|
447 |
-
"r":0.
|
448 |
-
"f":0.
|
449 |
},
|
450 |
"nummod":{
|
451 |
-
"p":0.
|
452 |
-
"r":0.
|
453 |
-
"f":0.
|
454 |
},
|
455 |
"flat":{
|
456 |
-
"p":0.
|
457 |
-
"r":0.
|
458 |
-
"f":0.
|
459 |
},
|
460 |
"compound:prt":{
|
461 |
-
"p":0.
|
462 |
-
"r":0.
|
463 |
-
"f":0.
|
464 |
},
|
465 |
"advcl":{
|
466 |
-
"p":0.
|
467 |
-
"r":0.
|
468 |
-
"f":0.
|
469 |
},
|
470 |
"mark":{
|
471 |
-
"p":0.
|
472 |
-
"r":0.
|
473 |
-
"f":0.
|
474 |
},
|
475 |
"cop":{
|
476 |
-
"p":0.
|
477 |
-
"r":0.
|
478 |
-
"f":0.
|
479 |
},
|
480 |
"dep":{
|
481 |
-
"p":0.
|
482 |
-
"r":0.
|
483 |
-
"f":0.
|
484 |
},
|
485 |
"nmod":{
|
486 |
-
"p":0.
|
487 |
-
"r":0.
|
488 |
-
"f":0.
|
489 |
},
|
490 |
"iobj":{
|
491 |
-
"p":0.
|
492 |
-
"r":0.
|
493 |
-
"f":0.
|
494 |
},
|
495 |
"xcomp":{
|
496 |
-
"p":0.
|
497 |
-
"r":0.
|
498 |
-
"f":0.
|
499 |
},
|
500 |
"list":{
|
501 |
-
"p":0.
|
502 |
"r":0.3333333333,
|
503 |
-
"f":0.
|
504 |
},
|
505 |
"vocative":{
|
506 |
"p":0.0,
|
@@ -508,64 +506,59 @@
|
|
508 |
"f":0.0
|
509 |
},
|
510 |
"fixed":{
|
511 |
-
"p":0.
|
512 |
-
"r":0.
|
513 |
-
"f":0.
|
514 |
},
|
515 |
-
"
|
516 |
-
"p":0.
|
517 |
-
"r":0.
|
518 |
-
"f":0.
|
519 |
},
|
520 |
-
"
|
521 |
-
"p":0.
|
522 |
-
"r":0.
|
523 |
-
"f":0.
|
524 |
},
|
525 |
"obl:tmod":{
|
526 |
-
"p":0.
|
527 |
-
"r":0.
|
528 |
-
"f":0.
|
529 |
},
|
530 |
"discourse":{
|
531 |
"p":0.0,
|
532 |
"r":0.0,
|
533 |
"f":0.0
|
534 |
-
},
|
535 |
-
"obl:lmod":{
|
536 |
-
"p":0.0,
|
537 |
-
"r":0.0,
|
538 |
-
"f":0.0
|
539 |
}
|
540 |
},
|
541 |
-
"
|
542 |
-
"
|
543 |
-
"ents_p":0.
|
544 |
-
"ents_r":0.
|
545 |
-
"ents_f":0.
|
546 |
"ents_per_type":{
|
547 |
"PER":{
|
548 |
-
"p":0.
|
549 |
-
"r":0.
|
550 |
-
"f":0.
|
551 |
},
|
552 |
"ORG":{
|
553 |
-
"p":0.
|
554 |
-
"r":0.
|
555 |
-
"f":0.
|
556 |
},
|
557 |
"MISC":{
|
558 |
-
"p":0.
|
559 |
-
"r":0.
|
560 |
-
"f":0.
|
561 |
},
|
562 |
"LOC":{
|
563 |
-
"p":0.
|
564 |
-
"r":0.
|
565 |
-
"f":0.
|
566 |
}
|
567 |
},
|
568 |
-
"speed":
|
569 |
},
|
570 |
"sources":[
|
571 |
{
|
@@ -580,12 +573,6 @@
|
|
580 |
"license":"CC BY-SA 4.0",
|
581 |
"author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
|
582 |
},
|
583 |
-
{
|
584 |
-
"name":"Lemmatization Lists",
|
585 |
-
"url":"https://github.com/michmech/lemmatization-lists/",
|
586 |
-
"license":"ODbL",
|
587 |
-
"author":"Michal M\u011bchura"
|
588 |
-
},
|
589 |
{
|
590 |
"name":"Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)",
|
591 |
"url":"https://spacy.io",
|
|
|
1 |
{
|
2 |
"lang":"da",
|
3 |
"name":"core_news_lg",
|
4 |
+
"version":"3.3.0",
|
5 |
+
"description":"Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner, attribute_ruler.",
|
6 |
"author":"Explosion",
|
7 |
"email":"[email protected]",
|
8 |
"url":"https://explosion.ai",
|
9 |
"license":"CC BY-SA 4.0",
|
10 |
+
"spacy_version":">=3.3.0.dev0,<3.4.0",
|
11 |
+
"spacy_git_version":"849bef2de",
|
12 |
"vectors":{
|
13 |
"width":300,
|
14 |
"vectors":500000,
|
|
|
212 |
"punct",
|
213 |
"xcomp"
|
214 |
],
|
215 |
"attribute_ruler":[
|
216 |
|
217 |
],
|
218 |
"ner":[
|
219 |
"LOC",
|
|
|
226 |
"tok2vec",
|
227 |
"morphologizer",
|
228 |
"parser",
|
|
|
229 |
"lemmatizer",
|
230 |
+
"attribute_ruler",
|
231 |
"ner"
|
232 |
],
|
233 |
"components":[
|
234 |
"tok2vec",
|
235 |
"morphologizer",
|
236 |
"parser",
|
237 |
+
"lemmatizer",
|
238 |
"senter",
|
239 |
"attribute_ruler",
|
|
|
240 |
"ner"
|
241 |
],
|
242 |
"disabled":[
|
|
|
247 |
"token_p":0.9977732598,
|
248 |
"token_r":0.9974835463,
|
249 |
"token_f":0.997628382,
|
250 |
+
"pos_acc":0.9633898305,
|
251 |
+
"morph_acc":0.9568038741,
|
252 |
+
"morph_micro_p":0.9727434528,
|
253 |
+
"morph_micro_r":0.9655746807,
|
254 |
+
"morph_micro_f":0.9691458101,
|
255 |
"morph_per_feat":{
|
256 |
"Mood":{
|
257 |
+
"p":0.9799043062,
|
258 |
+
"r":0.9761677788,
|
259 |
+
"f":0.9780324737
|
260 |
},
|
261 |
"Tense":{
|
262 |
+
"p":0.9772727273,
|
263 |
+
"r":0.9713855422,
|
264 |
+
"f":0.9743202417
|
265 |
},
|
266 |
"VerbForm":{
|
267 |
+
"p":0.9686153846,
|
268 |
+
"r":0.9632802938,
|
269 |
+
"f":0.9659404725
|
270 |
},
|
271 |
"Voice":{
|
272 |
+
"p":0.9798206278,
|
273 |
+
"r":0.9798206278,
|
274 |
+
"f":0.9798206278
|
275 |
},
|
276 |
"Definite":{
|
277 |
+
"p":0.968812475,
|
278 |
+
"r":0.9573291189,
|
279 |
+
"f":0.963036566
|
280 |
},
|
281 |
"Gender":{
|
282 |
+
"p":0.9597720416,
|
283 |
+
"r":0.9514788966,
|
284 |
+
"f":0.9556074766
|
285 |
},
|
286 |
"Number":{
|
287 |
+
"p":0.9683961022,
|
288 |
+
"r":0.9590505999,
|
289 |
+
"f":0.9637006945
|
290 |
},
|
291 |
"AdpType":{
|
292 |
+
"p":0.9982206406,
|
293 |
+
"r":0.9920424403,
|
294 |
+
"f":0.9951219512
|
295 |
},
|
296 |
"PartType":{
|
297 |
+
"p":0.996763754,
|
298 |
"r":1.0,
|
299 |
+
"f":0.9983792545
|
300 |
},
|
301 |
"Case":{
|
302 |
+
"p":0.9806451613,
|
303 |
+
"r":0.9605055292,
|
304 |
+
"f":0.9704708699
|
305 |
},
|
306 |
"Person":{
|
307 |
+
"p":0.9804270463,
|
308 |
+
"r":0.9786856128,
|
309 |
+
"f":0.9795555556
|
310 |
},
|
311 |
"PronType":{
|
312 |
+
"p":0.9835390947,
|
313 |
+
"r":0.9827302632,
|
314 |
+
"f":0.9831345125
|
315 |
},
|
316 |
"NumType":{
|
317 |
+
"p":0.9931506849,
|
318 |
"r":0.9602649007,
|
319 |
+
"f":0.9764309764
|
320 |
},
|
321 |
"Degree":{
|
322 |
+
"p":0.9578313253,
|
323 |
+
"r":0.9578313253,
|
324 |
+
"f":0.9578313253
|
325 |
},
|
326 |
"Reflex":{
|
327 |
"p":1.0,
|
|
|
329 |
"f":1.0
|
330 |
},
|
331 |
"Number[psor]":{
|
332 |
+
"p":0.9772727273,
|
333 |
"r":1.0,
|
334 |
+
"f":0.9885057471
|
335 |
},
|
336 |
"Poss":{
|
337 |
+
"p":0.9887640449,
|
338 |
"r":1.0,
|
339 |
+
"f":0.9943502825
|
340 |
},
|
341 |
"Foreign":{
|
342 |
+
"p":0.6,
|
343 |
+
"r":0.3,
|
344 |
+
"f":0.4
|
345 |
},
|
346 |
"Abbr":{
|
347 |
+
"p":0.0,
|
348 |
+
"r":0.0,
|
349 |
+
"f":0.0
|
350 |
},
|
351 |
"Style":{
|
352 |
"p":1.0,
|
|
|
354 |
"f":1.0
|
355 |
},
|
356 |
"Polite":{
|
357 |
+
"p":0.75,
|
358 |
+
"r":0.75,
|
359 |
+
"f":0.75
|
360 |
}
|
361 |
},
|
362 |
+
"sents_p":0.9103942652,
|
363 |
+
"sents_r":0.9007092199,
|
364 |
+
"sents_f":0.9055258467,
|
365 |
+
"dep_uas":0.8195787003,
|
366 |
+
"dep_las":0.7807576266,
|
367 |
"dep_las_per_type":{
|
368 |
"advmod":{
|
369 |
+
"p":0.6955345061,
|
370 |
+
"r":0.7259887006,
|
371 |
+
"f":0.7104353836
|
372 |
},
|
373 |
"root":{
|
374 |
+
"p":0.824686941,
|
375 |
+
"r":0.8173758865,
|
376 |
+
"f":0.821015138
|
377 |
},
|
378 |
"nsubj":{
|
379 |
+
"p":0.8361884368,
|
380 |
+
"r":0.8238396624,
|
381 |
+
"f":0.829968119
|
382 |
},
|
383 |
"case":{
|
384 |
+
"p":0.9003984064,
|
385 |
+
"r":0.8915187377,
|
386 |
+
"f":0.8959365709
|
387 |
},
|
388 |
"obl":{
|
389 |
+
"p":0.7221297837,
|
390 |
"r":0.6739130435,
|
391 |
+
"f":0.697188755
|
392 |
},
|
393 |
"cc":{
|
394 |
+
"p":0.7630057803,
|
395 |
+
"r":0.7674418605,
|
396 |
+
"f":0.7652173913
|
397 |
},
|
398 |
"conj":{
|
399 |
+
"p":0.6106442577,
|
400 |
+
"r":0.5813333333,
|
401 |
+
"f":0.5956284153
|
402 |
},
|
403 |
"obj":{
|
404 |
+
"p":0.7893772894,
|
405 |
+
"r":0.8368932039,
|
406 |
+
"f":0.8124410933
|
407 |
},
|
408 |
"aux":{
|
409 |
+
"p":0.8764705882,
|
410 |
+
"r":0.8688046647,
|
411 |
+
"f":0.8726207906
|
412 |
},
|
413 |
"acl:relcl":{
|
414 |
+
"p":0.6300578035,
|
415 |
+
"r":0.5891891892,
|
416 |
+
"f":0.6089385475
|
417 |
},
|
418 |
"advmod:lmod":{
|
419 |
+
"p":0.7272727273,
|
420 |
+
"r":0.7164179104,
|
421 |
+
"f":0.7218045113
|
422 |
},
|
423 |
"det":{
|
424 |
+
"p":0.9140495868,
|
425 |
+
"r":0.9110378913,
|
426 |
+
"f":0.9125412541
|
427 |
},
|
428 |
"amod":{
|
429 |
+
"p":0.8080645161,
|
430 |
+
"r":0.8549488055,
|
431 |
+
"f":0.8308457711
|
432 |
},
|
433 |
"nmod:poss":{
|
434 |
+
"p":0.7373737374,
|
435 |
+
"r":0.7227722772,
|
436 |
+
"f":0.73
|
437 |
},
|
438 |
"ccomp":{
|
439 |
+
"p":0.7068965517,
|
440 |
+
"r":0.6612903226,
|
441 |
+
"f":0.6833333333
|
442 |
},
|
443 |
"nummod":{
|
444 |
+
"p":0.8360655738,
|
445 |
+
"r":0.85,
|
446 |
+
"f":0.8429752066
|
447 |
},
|
448 |
"flat":{
|
449 |
+
"p":0.7844311377,
|
450 |
+
"r":0.8675496689,
|
451 |
+
"f":0.8238993711
|
452 |
},
|
453 |
"compound:prt":{
|
454 |
+
"p":0.5,
|
455 |
+
"r":0.2926829268,
|
456 |
+
"f":0.3692307692
|
457 |
},
|
458 |
"advcl":{
|
459 |
+
"p":0.6545454545,
|
460 |
+
"r":0.6206896552,
|
461 |
+
"f":0.6371681416
|
462 |
},
|
463 |
"mark":{
|
464 |
+
"p":0.8781512605,
|
465 |
+
"r":0.8583162218,
|
466 |
+
"f":0.8681204569
|
467 |
},
|
468 |
"cop":{
|
469 |
+
"p":0.8121546961,
|
470 |
+
"r":0.84,
|
471 |
+
"f":0.8258426966
|
472 |
},
|
473 |
"dep":{
|
474 |
+
"p":0.145631068,
|
475 |
+
"r":0.2830188679,
|
476 |
+
"f":0.1923076923
|
477 |
},
|
478 |
"nmod":{
|
479 |
+
"p":0.6549707602,
|
480 |
+
"r":0.65625,
|
481 |
+
"f":0.6556097561
|
482 |
},
|
483 |
"iobj":{
|
484 |
+
"p":0.8125,
|
485 |
+
"r":0.5909090909,
|
486 |
+
"f":0.6842105263
|
487 |
},
|
488 |
"xcomp":{
|
489 |
+
"p":0.4772727273,
|
490 |
+
"r":0.3559322034,
|
491 |
+
"f":0.4077669903
|
492 |
+
},
|
493 |
+
"appos":{
|
494 |
+
"p":0.5384615385,
|
495 |
+
"r":0.4242424242,
|
496 |
+
"f":0.4745762712
|
497 |
},
|
498 |
"list":{
|
499 |
+
"p":0.5,
|
500 |
"r":0.3333333333,
|
501 |
+
"f":0.4
|
502 |
},
|
503 |
"vocative":{
|
504 |
"p":0.0,
|
|
|
506 |
"f":0.0
|
507 |
},
|
508 |
"fixed":{
|
509 |
+
"p":0.8717948718,
|
510 |
+
"r":0.8292682927,
|
511 |
+
"f":0.85
|
512 |
},
|
513 |
+
"obl:lmod":{
|
514 |
+
"p":0.0,
|
515 |
+
"r":0.0,
|
516 |
+
"f":0.0
|
517 |
},
|
518 |
+
"expl":{
|
519 |
+
"p":0.8529411765,
|
520 |
+
"r":0.8529411765,
|
521 |
+
"f":0.8529411765
|
522 |
},
|
523 |
"obl:tmod":{
|
524 |
+
"p":0.6363636364,
|
525 |
+
"r":0.3888888889,
|
526 |
+
"f":0.4827586207
|
527 |
},
|
528 |
"discourse":{
|
529 |
"p":0.0,
|
530 |
"r":0.0,
|
531 |
"f":0.0
|
532 |
}
|
533 |
},
|
534 |
+
"lemma_acc":0.9516707022,
|
535 |
+
"tag_acc":0.9633898305,
|
536 |
+
"ents_p":0.8183716075,
|
537 |
+
"ents_r":0.8166666667,
|
538 |
+
"ents_f":0.8175182482,
|
539 |
"ents_per_type":{
|
540 |
"PER":{
|
541 |
+
"p":0.8993710692,
|
542 |
+
"r":0.8614457831,
|
543 |
+
"f":0.88
|
544 |
},
|
545 |
"ORG":{
|
546 |
+
"p":0.7303370787,
|
547 |
+
"r":0.7222222222,
|
548 |
+
"f":0.7262569832
|
549 |
},
|
550 |
"MISC":{
|
551 |
+
"p":0.7288135593,
|
552 |
+
"r":0.7610619469,
|
553 |
+
"f":0.7445887446
|
554 |
},
|
555 |
"LOC":{
|
556 |
+
"p":0.8672566372,
|
557 |
+
"r":0.8828828829,
|
558 |
+
"f":0.875
|
559 |
}
|
560 |
},
|
561 |
+
"speed":10791.2692595094
|
562 |
},
|
563 |
"sources":[
|
564 |
{
|
|
|
573 |
"license":"CC BY-SA 4.0",
|
574 |
"author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
|
575 |
},
|
576 |
{
|
577 |
"name":"Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)",
|
578 |
"url":"https://spacy.io",
|
morphologizer/model
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c9a904d06964b6afa205f053f74bc3b869bab70872d9265d38fadd867450df26
|
3 |
+
size 61351
|
ner/model
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:3d60d6db1813b3f2d8de3dea75ed89b12bf168bb820f9bda6630e5a51d4d1ecb
|
3 |
+
size 6496592
|
parser/model
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 308728
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c112c2427b1cfb0608eb1ef39d0206558de657c031eca80773d43e578f7517f8
|
3 |
size 308728
|
parser/moves
CHANGED
@@ -1 +1 @@
|
|
1 |
-
��moves�D{"0":{"":
|
|
|
1 |
+
��moves�D{"0":{"":41615},"1":{"":34382},"2":{"case":7526,"nsubj":6005,"det":4341,"amod":3967,"advmod":3662,"mark":3530,"aux":2436,"cc":2264,"punct":2187,"cop":1330,"obl":894,"nummod":834,"nmod:poss":656,"nmod":463,"expl":291,"ccomp":203,"obj":195,"xcomp":122,"case||nmod":73,"obl:tmod":53,"dep":48,"acl:relcl":43},"3":{"punct":8693,"obl":3951,"obj":3760,"nmod":3569,"conj":2747,"advmod":2087,"flat":1302,"nsubj":1169,"acl:relcl":1132,"advcl":809,"amod":622,"advmod:lmod":423,"fixed":390,"dep":322,"xcomp":272,"appos":268,"compound:prt":261,"ccomp":252,"acl:relcl||nsubj":237,"case":202,"nummod":168,"list":159,"nmod:poss":156,"punct||conj":151,"cc":135,"mark":133,"iobj":107,"expl":77,"cop":69,"nmod||case":60,"aux":48,"obl:tmod":45,"obl:lmod":44,"cc||case":43,"advcl||advmod":43,"cc||conj":40,"case||obl":38,"punct||case":33},"4":{"ROOT":4383}}�cfg��neg_key�
|
senter/model
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c5afa93a8f788243e6c02d5ab57762e55511fbfa00f89ee3c21bd75cea7ae6bc
|
3 |
+
size 219953
|
tok2vec/model
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:afa9aae853e4d60a66837dd127015d27d43c8f772b22c8c7b172238e5dfaa846
|
3 |
+
size 6365604
|
tokenizer
CHANGED
The diff for this file is too large to render.
|
|
vocab/key2row
CHANGED
Binary files a/vocab/key2row and b/vocab/key2row differ
|
|
vocab/strings.json
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:528d1a6bb62dc4d608b0ea1be75d557f41cdd76867460448bbbc174d34ae193a
|
3 |
+
size 10081139
|