- `f2ca6a6` Create safe fallback for models not yet initialized with masking_type (verified; Ruurd, committed 7 days ago)
- `b43e862` Overhaul code for appropriate masking of the full model instead of just the attention layers (verified; Ruurd, committed 7 days ago)
- `1723639` Implement improved attention masking for bidirectional_masked (verified; Ruurd, committed 7 days ago)
- `74479ff` Fix: use input_ids instead of current_tokens for the first noise iteration (verified; Ruurd, committed 7 days ago)
- `620a6cd` Change LoRA size from 256 to 512; revert to bidirectional_masked (verified; Ruurd, committed 10 days ago)