Hi Team,
This is a report from Giskard Bot Scan 🐢.
We have identified 7 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset cardiffnlp/tweet_sentiment_multilingual (subset english, split validation).
👉Robustness issues (5)
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 24.07% of the cases. We expected the predictions not to be affected by this transformation.
Level | Data slice | Metric | Deviation
major 🔴 | — | Fail rate = 0.241 | 78/324 tested samples (24.07%) changed prediction after perturbation
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Transform to uppercase(text) | Original prediction | Prediction after perturbation
2 | Hold on... Sam Smith may do the theme to Spectre!? Dope!!!!!! #007 #SPECTRE #JamesBond | HOLD ON... SAM SMITH MAY DO THE THEME TO SPECTRE!? DOPE!!!!!! #007 #SPECTRE #JAMESBOND | positive (p = 0.98) | neutral (p = 0.77)
4 | Gonna watch Final Destination 5 tonight. I always leave the theater so afraid of everything. No huge escalators for sure :S | GONNA WATCH FINAL DESTINATION 5 TONIGHT. I ALWAYS LEAVE THE THEATER SO AFRAID OF EVERYTHING. NO HUGE ESCALATORS FOR SURE :S | positive (p = 0.96) | negative (p = 0.72)
9 | Disappointed the Knicks vs Nets game got canceled tonight\u002c but I\u2019m even more hyped for Knicks vs Heat on Friday! | DISAPPOINTED THE KNICKS VS NETS GAME GOT CANCELED TONIGHT\U002C BUT I\U2019M EVEN MORE HYPED FOR KNICKS VS HEAT ON FRIDAY! | negative (p = 0.47) | positive (p = 0.97)
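As a rough illustration of how a fail rate like the one above can be computed, the sketch below counts how often a prediction changes once the text is uppercased. The `toy_predict` function is a hypothetical case-sensitive stand-in, not the scanned model, and skipping samples that the transformation leaves unchanged is an assumption about how the tested-sample count is obtained.

```python
from typing import Callable, Sequence, Tuple


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> Tuple[int, int, float]:
    """Return (changed, tested, rate) for one text perturbation."""
    # Assumption: only samples actually altered by the transformation are tested.
    tested_texts = [t for t in texts if transform(t) != t]
    changed = sum(1 for t in tested_texts if predict(t) != predict(transform(t)))
    tested = len(tested_texts)
    return changed, tested, (changed / tested if tested else 0.0)


def toy_predict(text: str) -> str:
    # Hypothetical case-sensitive keyword classifier, for illustration only.
    if "Dope" in text or "hyped" in text:
        return "positive"
    if "afraid" in text:
        return "negative"
    return "neutral"


samples = [
    "Hold on... Dope!!!!!!",
    "so afraid of everything",
    "nothing special here",
    "I'm even more hyped",
    "123",  # unchanged by uppercasing, so it is not counted as tested
]

changed, tested, rate = fail_rate(toy_predict, samples, str.upper)
print(f"{changed}/{tested} tested samples changed prediction ({rate:.2%})")

# The figure reported above follows the same arithmetic: 78 / 324.
reported_rate = 78 / 324
print(f"reported fail rate = {reported_rate:.3f}")
```

Passing the real model's prediction function and the validation texts to `fail_rate` in place of the toy inputs would be the natural next step for checking the reported number.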
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 18.52% of the cases. We expected the predictions not to be affected by this transformation.
Level | Data slice | Metric | Deviation
major 🔴 | — | Fail rate = 0.185 | 60/324 tested samples (18.52%) changed prediction after perturbation
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Transform to title case(text) | Original prediction | Prediction after perturbation
0 | @user @user I think after Charlie Hebdo the French did NOT react as the US did after 9/11. But they may do this time around. | @User @User I Think After Charlie Hebdo The French Did Not React As The Us Did After 9/11. But They May Do This Time Around. | negative (p = 0.50) | neutral (p = 0.73)
1 | "Interview with Devon Alexander """"Speed Kills"""" (VIDEO) On Tuesday Oct 16th we had the privilege of catch up with... | "Interview With Devon Alexander """"Speed Kills"""" (Video) On Tuesday Oct 16Th We Had The Privilege Of Catch Up With... | neutral (p = 0.67) | positive (p = 0.91)
4 | Gonna watch Final Destination 5 tonight. I always leave the theater so afraid of everything. No huge escalators for sure :S | Gonna Watch Final Destination 5 Tonight. I Always Leave The Theater So Afraid Of Everything. No Huge Escalators For Sure :S | positive (p = 0.96) | negative (p = 0.39)
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 14.74% of the cases. We expected the predictions not to be affected by this transformation.
Level | Data slice | Metric | Deviation
major 🔴 | — | Fail rate = 0.147 | 46/312 tested samples (14.74%) changed prediction after perturbation
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Add typos(text) | Original prediction | Prediction after perturbation
1 | "Interview with Devon Alexander """"Speed Kills"""" (VIDEO) On Tuesday Oct 16th we had the privilege of catch up with... | "Interview with Devon Alexadner """"Speed Kils"""" (VIDSO) On Tuesdxay Oct 16th we had the privilege of catch up with... | neutral (p = 0.67) | positive (p = 0.76)
4 | Gonna watch Final Destination 5 tonight. I always leave the theater so afraid of everything. No huge escalators for sure :S | Gonna watvch Final Destination 5 tonihgt. U always leave rthe theater so afraid of everything. No huge escalators for sure :S | positive (p = 0.96) | negative (p = 0.54)
11 | """""@_eryflores: March 16 Luke Bryan is gonna at the Houston Rodeo. I HAVE to go\u002c Its a MUST!""""" | """""@_eryflores: March 16 Luke Bryzn is gonna at the Houtson Rodo. I HAVE to go\u002c Its a MUST!""""" | positive (p = 0.76) | neutral (p = 0.72)
When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 9.36% of the cases. We expected the predictions not to be affected by this transformation.
Level | Data slice | Metric | Deviation
medium 🟡 | — | Fail rate = 0.094 | 28/299 tested samples (9.36%) changed prediction after perturbation
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Punctuation Removal(text) | Original prediction | Prediction after perturbation
1 | "Interview with Devon Alexander """"Speed Kills"""" (VIDEO) On Tuesday Oct 16th we had the privilege of catch up with... | Interview with Devon Alexander \Speed Kills\ (VIDEO) On Tuesday Oct 16th we had the privilege of catch up with | neutral (p = 0.67) | positive (p = 0.69)
2 | Hold on... Sam Smith may do the theme to Spectre!? Dope!!!!!! #007 #SPECTRE #JamesBond | Hold on Sam Smith may do the theme to Spectre Dope #007 #SPECTRE #JamesBond | positive (p = 0.98) | neutral (p = 0.93)
4 | Gonna watch Final Destination 5 tonight. I always leave the theater so afraid of everything. No huge escalators for sure :S | Gonna watch Final Destination 5 tonight I always leave the theater so afraid of everything No huge escalators for sure S | positive (p = 0.96) | negative (p = 0.81)
When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 6.92% of the cases. We expected the predictions not to be affected by this transformation.
Level | Data slice | Metric | Deviation
medium 🟡 | — | Fail rate = 0.069 | 22/318 tested samples (6.92%) changed prediction after perturbation
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| text | Transform to lowercase(text) | Original prediction | Prediction after perturbation
36 | David Cameron's statement on camera on Thursday 03 September 2015: he will take in 'more' of the refugees: was he speaking TO TV Cameras? | david cameron's statement on camera on thursday 03 september 2015: he will take in 'more' of the refugees: was he speaking to tv cameras? | negative (p = 0.52) | neutral (p = 0.68)
66 | "George Lincoln Rockwell was one of the 1st to recognize that Conservatives like @user Buckley, Goldwater & Reagan were #Cucks for Israel." | "george lincoln rockwell was one of the 1st to recognize that conservatives like @user buckley, goldwater & reagan were #cucks for israel." | positive (p = 0.87) | negative (p = 0.37)
69 | Amazon Prime Day beats Black Friday says retailer Amazon Prime Day may have been an excuse for the retail... | amazon prime day beats black friday says retailer amazon prime day may have been an excuse for the retail... | negative (p = 0.64) | neutral (p = 0.56)
👉Performance issues (1)
For records in the dataset where text contains "time", the Precision is 11.88% lower than the global Precision.
Level | Data slice | Metric | Deviation
major 🔴 | text contains "time" | Precision = 0.650 | -11.88% than global
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| text | label | Predicted label
93 | "Sir John dined from Justin Bieber was closed, burst into the same time--""There is too awful whisper,--""I may accelerate that" | negative | neutral (p = 0.79)
104 | I might reread the Harry Potter books for like the 7th time | positive | neutral (p = 0.77)
109 | Serena and Venus Williams Face Off at US Open: For the 27th time, the sisters played against each other 14 yea... | neutral | positive (p = 0.61)
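The report gives the slice precision and the deviation but not the global precision itself. The back-of-envelope sketch below recovers an approximate global value under two possible readings of "-11.88% than global": a relative difference or a difference in percentage points. Which reading applies is an assumption; the report does not say, and the variable names are illustrative.

```python
slice_precision = 0.650      # from the table above
reported_deviation = -0.1188  # "-11.88% than global"

# Reading 1 (assumption): relative deviation,
#   (slice - global) / global = deviation  =>  global = slice / (1 + deviation)
global_if_relative = slice_precision / (1 + reported_deviation)

# Reading 2 (assumption): absolute deviation in percentage points,
#   slice - global = deviation  =>  global = slice - deviation
global_if_absolute = slice_precision - reported_deviation

print(f"global precision if relative: {global_if_relative:.3f}")
print(f"global precision if absolute: {global_if_absolute:.3f}")
```

Comparing either estimate with the precision measured on the full validation split would show which reading matches the scan.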
👉Ethical issues (1)
When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 9.52% of the cases. We expected the predictions not to be affected by this transformation.
Level | Data slice | Metric | Deviation
medium 🟡 | — | Fail rate = 0.095 | 2/21 tested samples (9.52%) changed prediction after perturbation
Taxonomy
avid-effect:ethics:E0101
avid-effect:performance:P0201
🔍✨Examples
| text | Switch Religion(text) | Original prediction | Prediction after perturbation
97 | Correction: Carson did not say Christians deserve more 1st Amendment protections than other religions. But what he did say was clear as mud. | Correction: Carson did not say jews deserve more 1st Amendment protections than other religions. But what he did say was clear as mud. | negative (p = 0.48) | neutral (p = 0.52)
275 | @user Prayers for all of you today. May God carry each one of you during this sad time ""Footprints in the Sand"", RIP Frank Gifford" | @user Prayers for all of you today. May allah carry each one of you during this sad time ""Footprints in the Sand"", RIP Frank Gifford" | positive (p = 0.36) | negative (p = 0.42)
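The two examples above suggest that this perturbation swaps individual religion-related words and leaves the rest of the text untouched. A minimal sketch of such a word-level substitution follows; the `SWAPS` table is hypothetical and contains only the two replacements visible in the examples, not the scanner's actual word lists.

```python
import re

# Hypothetical substitution table, taken only from the two examples above.
SWAPS = {"Christians": "jews", "God": "allah"}

# Match whole words only, so that e.g. "Godzilla" is left alone.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, SWAPS)) + r")\b")


def switch_religion(text: str) -> str:
    """Replace each listed term with its counterpart; leave other text as-is."""
    return _PATTERN.sub(lambda m: SWAPS[m.group(1)], text)


original = "May God carry each one of you during this sad time"
perturbed = switch_religion(original)
print(perturbed)
```

Feeding `original` and `perturbed` to the model and comparing the two predictions is then the same check as for the robustness perturbations.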
Check out the Giskard Space and test your model.
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.