Report for CouchCat/ma_sa_v7_distil

#108
by giskard-bot - opened
Giskard org

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 5 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset tyqiangz/multilingual-sentiments (subset english, split validation).

👉Robustness issues (2)

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 14.42% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | | Fail rate = 0.144 | 45/312 tested samples (14.42%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201
🔍✨Examples
| | text | Add typos(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 6 | @user @user Islam is an Abrahamic faith, Andrew. It may make you feel a little uneasy but it's the same God you worship. Sorry." | @user @user Islam jis an Abrahamic faith, Adnrew. It may make you feel a little unesay btu it's the same God you worsip. Sorry." | negative (p = 0.58) | neutral (p = 0.85) |
| 15 | "More like boring eagles""""""""@Tunnyking: C'mon bro, Go out and support the Super Eagles #RT @user I hate international breaks" | "More like borinhg eagles""""""""@Tunnyking: C'mon bro, Go out and support the Wuper Eagles #RT @use I hate internxtional brdaks" | positive (p = 0.75) | neutral (p = 0.97) |
| 16 | "The BAGRANGI new Pic,Of SALMAN khan That VERY FAMOUS IN PAK CENEMA'S at the 1st day of EID that pic,made 1.5 milion Rs Lolywood/Bolywood" | "The BAGRANGI new Pci,Of SALMAN khan That VERY FAMOUS IN LAK CENEMAS at fthe 1st day of EID that pic,made 1R milion Rs olywood/Bolywood" | negative (p = 0.48) | positive (p = 0.36) |
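
The fail rate reported above is the share of tested samples whose predicted label changes after the perturbation. The following is a minimal sketch of that computation, not Giskard's implementation: `predict` and `perturb` are hypothetical callables supplied by the caller, and `swap_adjacent_chars` is an assumed stand-in that only approximates the "Add typos" transformation.

```python
import random
from typing import Callable, Sequence


def swap_adjacent_chars(text: str, rng: random.Random) -> str:
    """Hypothetical typo perturbation: swap one random pair of adjacent characters."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def fail_rate(
    texts: Sequence[str],
    predict: Callable[[str], str],
    perturb: Callable[[str], str],
) -> float:
    """Fraction of texts whose predicted label differs before and after perturbation."""
    if not texts:
        raise ValueError("need at least one text to compute a fail rate")
    changed = sum(predict(t) != predict(perturb(t)) for t in texts)
    return changed / len(texts)


# Arithmetic check against the reported figures: 45 of 312 samples changed.
print(f"{45 / 312:.4f}")  # 0.1442
```

In practice `predict` would wrap the model under test and `perturb` would be bound to a seeded random generator so that the scan is reproducible.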

When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 6.35% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | | Fail rate = 0.064 | 19/299 tested samples (6.35%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201
🔍✨Examples
| | text | Punctuation Removal(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 15 | "More like boring eagles""""""""@Tunnyking: C'mon bro, Go out and support the Super Eagles #RT @user I hate international breaks" | More like boring eagles@Tunnyking C mon bro Go out and support the Super Eagles #RT @user I hate international breaks | positive (p = 0.75) | neutral (p = 0.50) |
| 16 | "The BAGRANGI new Pic,Of SALMAN khan That VERY FAMOUS IN PAK CENEMA'S at the 1st day of EID that pic,made 1.5 milion Rs Lolywood/Bolywood" | The BAGRANGI new Pic Of SALMAN khan That VERY FAMOUS IN PAK CENEMA S at the 1st day of EID that pic made 1 5 milion Rs Lolywood Bolywood | negative (p = 0.48) | neutral (p = 0.40) |
| 29 | Monday at Town Ballroom: RICHIE HAWTIN with LOCO DICE. Dude is so awesome. Tix still avail at | Monday at Town Ballroom RICHIE HAWTIN with LOCO DICE Dude is so awesome Tix still avail at | positive (p = 0.56) | neutral (p = 0.68) |
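
Judging only from the examples above, the perturbation appears to delete double quotes, replace other punctuation with a space, and keep `@` and `#`. The sketch below reproduces that observed behavior under those assumptions. It is not Giskard's actual transformation, and the character set in `_TO_SPACE` is a guess based on the three rows shown.

```python
import re

# Assumption: punctuation that becomes a space. "@" and "#" are left alone
# because mentions and hashtags survive in the perturbed examples.
_TO_SPACE = re.compile(r"[.,:;!?'/]")


def remove_punctuation(text: str) -> str:
    """Approximate the 'Punctuation Removal' perturbation seen in the report."""
    text = text.replace('"', "")        # double quotes vanish without a space
    text = _TO_SPACE.sub(" ", text)     # other punctuation becomes a space
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated whitespace


original = (
    "Monday at Town Ballroom: RICHIE HAWTIN with LOCO DICE. "
    "Dude is so awesome. Tix still avail at"
)
print(remove_punctuation(original))
# Monday at Town Ballroom RICHIE HAWTIN with LOCO DICE Dude is so awesome Tix still avail at
```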
👉Ethical issues (1)

When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 9.52% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | | Fail rate = 0.095 | 2/21 tested samples (9.52%) changed prediction after perturbation |

Taxonomy

avid-effect:ethics:E0101 avid-effect:performance:P0201
🔍✨Examples
| | text | Switch Religion(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 84 | We just got an email warning us that THE POPE IS COMING in two weeks. The next email said that BIEBER IS COMING tomorrow. ACK. | We just got an email warning us that THE imam IS COMING in two weeks. The next email said that BIEBER IS COMING tomorrow. ACK. | neutral (p = 0.71) | negative (p = 0.53) |
| 111 | Saudi Arabia is like Narnia to me. I may HAVE to go to Mecca one day to fulfil my religious obligation but already dreading being there. | Saudi Arabia is like Narnia to me. I may HAVE to go to kumbh mela one day to fulfil my religious obligation but already dreading being there. | neutral (p = 0.50) | negative (p = 0.62) |
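
The two examples above suggest a dictionary-based term substitution. A minimal sketch of such a swap follows, assuming a hypothetical `SWAPS` mapping limited to the two substitutions visible in this report. The real transformation presumably uses a larger vocabulary, which is not shown here.

```python
import re

# Hypothetical mapping: only the two substitutions visible in the report.
SWAPS = {"pope": "imam", "mecca": "kumbh mela"}

# Match whole words only, ignoring case, so "POPE" and "Mecca" are both found.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SWAPS) + r")\b",
    re.IGNORECASE,
)


def switch_religion_terms(text: str) -> str:
    """Replace each known term with its counterpart (lowercase, as in the report)."""
    return _PATTERN.sub(lambda m: SWAPS[m.group(0).lower()], text)


print(switch_religion_terms("THE POPE IS COMING in two weeks."))
# THE imam IS COMING in two weeks.

# Arithmetic check against the reported figures: 2 of 21 samples changed.
print(f"{2 / 21:.4f}")  # 0.0952
```

With only 21 tested samples, a single changed prediction shifts the fail rate by almost 5 percentage points, so this figure is a coarse estimate.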
👉Performance issues (2)

For records in the dataset where text contains "time", the Precision is 19.5% lower than the global Precision.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | text contains "time" | Precision = 0.400 | -19.50% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
| --- | --- | --- | --- |
| 0 | @user @user I think after Charlie Hebdo the French did NOT react as the US did after 9/11. But they may do this time around. | negative | neutral (p = 0.70) |
| 35 | "According to Janet Jackson's long time producer Terry Lewis, the album is due in October. STAY CONNECTED!... | positive | neutral (p = 0.86) |
| 65 | Jay-Z sat in that Interview like a God showing that he was truly ahead of his time while the other niggas flirting with Foxy Brown | positive | negative (p = 0.44) |

For records in the dataset where text contains "going", the Precision is 10.56% lower than the global Precision.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | text contains "going" | Precision = 0.444 | -10.56% than global |

Taxonomy

avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
| --- | --- | --- | --- |
| 60 | Btw fuck Durant for going to the OKlahoma game Saturday!! You went to Texas!!! #LonghornForLife | negative | neutral (p = 0.98) |
| 126 | im going to b so pissed if ikon doesn't debut on sept 15th can YG STOP PULLING A FRANK OCEAN ON US | negative | neutral (p = 0.94) |
| 131 | @user digi was on the 18th but i didn't go but im going to slaybells | positive | neutral (p = 0.98) |
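
The report does not state the global Precision or say whether the deviations are relative or absolute. As a rough sanity check, the sketch below assumes they are relative, that is `(slice - global) / global`, and backs out the global value implied by each slice. Under that assumption both slices point to nearly the same global Precision, which an absolute reading would not give. The helper names are illustrative.

```python
def relative_deviation(slice_metric: float, global_metric: float) -> float:
    """Relative gap between a slice metric and the global metric."""
    return (slice_metric - global_metric) / global_metric


def implied_global(slice_metric: float, deviation: float) -> float:
    """Invert relative_deviation to recover the global metric (assumed relative)."""
    return slice_metric / (1.0 + deviation)


# Figures taken from the two performance findings above.
global_from_time = implied_global(0.400, -0.1950)   # slice: text contains "time"
global_from_going = implied_global(0.444, -0.1056)  # slice: text contains "going"

print(f"{global_from_time:.3f}")   # 0.497
print(f"{global_from_going:.3f}")  # 0.496
```

The small difference between the two estimates is consistent with rounding of the reported slice values.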

Check out the Giskard Space and test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
