2025-09-05T19:18:35Z INFO 1939 [root]: /opt/conda/bin/neuronx-cc compile /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo --framework XLA --target trn1 --output /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff --model-type=transformer -O1 --lnc=1 '--internal-hlo2tensorizer-options=--experimental-unsafe-fp8e4m3fn-as-fp8e4m3 --verify-hlo=false' --logfile=/models/mistral-7b-v0.3-instruct-neuronx/layout_opt/log-neuron-cc.txt --verbose=35 2025-09-05T19:18:41Z INFO 1939 [root]: NeuronX Compiler version 2.20.9961.0+0acef03a Python version 3.10.17 HWM version 2.20.0.9961+0acef03a NumPy version 1.26.4 2025-09-05T19:18:41Z INFO 1971 [root]: XLA detected 2025-09-05T19:18:41Z INFO 1971 [root]: Pipeline: HLOToTensorizer Frontend StaticIOTranspose WalrusDriver BIRLinker Kelper NeffWrapper 2025-09-05T19:18:41Z INFO 1971 [root]: Intermediate files stored in /opt/vllm/neuronxcc-gss0wi_w, output in /opt/vllm 2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Job Pipeline len(in_states) 1 2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Processing input #0 2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Running pipeline Pipeline.0 2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.HLOToTensorizer.0 2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Job HLOToTensorizer len(in_states) 1 2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Processing input #0 2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: IR signature: b7a76fc5f3f76d1d69d57e0e784721bafd07e3a61734f6594e8c815123a8a771 for graph.hlo 2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Executing: /opt/conda/lib/python3.10/site-packages/neuronxcc/starfish/bin/hlo2penguin --input /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo --out-dir ./ --output penguin.py --remat --max-costly-ops=2 --max-live-in-size=5 --max-remat-chain-size=10 --max-mem-multiple=1.8 --min-def-use-distance=500 --remat-policy=transformer --allow-same-pass-remat=true --layers-per-module=1 --partition --emit-tensor-level-dropout-ops --experimental-unsafe-fp8e4m3fn-as-fp8e4m3 --verify-hlo=false --native-to-custom-softmax --partitioner-opts='--transformer' 2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: DEBUG: needsModular? No. macCnt 0 num non-trivial Ops 290 INFO: Switching to single-module compile. PrePartitionPipe skipped. INFO: Found memory bound graph INFO: Number of Native SoftmaxDx's detected and replaced: 0 INFO: Number of Native Softmax's detected and replaced: 0 Replaced 0 dropout sequences with OffloadedDropout INFO: HloMacCount has found 0 INFO: Traffic has found 7248289792 INFO: AIF 0 HLO Ops used in computation: parameter reshape transpose tuple Warning: Could not open file debug_info_hlo_partitions.json 2025-09-05 19:18:41.368875: W hilo/hlo2penguin/utils/DumpDebugInfo.cc:52] Truncating long HLO operator name %last = tuple(%p68, %transpose.290, %transpose.291, %transpose.292, %transpose.293, %transpose.294, %transpose.295, %transpose.296, %transpose.297, %transpose.298, %transpose.299, %transpose.300, %transpose.301, %transpose.302, %transpose.303, %transpose.304, %transpose.305, %transpose.306, %transpose.307, %transpose.308, %transpose.309, %transpose.310, %transpose.311, %transpose.312, %transpose.313, %transpose.314, %transpose.315, %transpose.316, %transpose.317, %transpose.318, %transpose.319, %transpos... to 512 characters in the compiler's debug metadata Invoking RemoveOptimizationBarriers pass 2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: IR signature: 2e0dd598350cceee9da216d825f016375db09d3b92ca63d286d74eb4016c4d07 for sg0000/HLOToTensorizer 2025-09-05T19:18:41Z INFO 1971 [job.HLOToTensorizer.0]: Job #0 finished 2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.HLOToTensorizer.0 2025-09-05T19:18:41Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.Frontend.0 2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Job Frontend len(in_states) 1 2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Processing input #0 2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Start model loading 2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Start tensorization 2025-09-05T19:18:41Z INFO 1971 [job.Frontend.0]: Num jobs: 32 2025-09-05T19:18:41Z USER 1971 [root/Tensorizer/Tensorizer]: Running Tensorizer 2025-09-05T19:18:41Z INFO 1971 [Tensorizer]: Frontend did not find netlist info. Switching to flat flow. 2025-09-05T19:18:41Z INFO 1971 [Tensorizer]: Building model from Penguin script "penguin.py"... 2025-09-05T19:18:41Z INFO 1971 [Tensorizer]: Tensorizer options: --run-pg-layout-and-tiling --enable-dse-after-mask-propagation --disable-concat-delinearizer --num-neuroncores-per-sengine=1 --num-neuroncores-per-sengine=1 --internal_dynamic_dma_scratch_size_per_partition=16384 --disable-bitcasted-transpose --dont-verify-after-all --fp32-cast=matmult-bf16 --mm-transpose-type=fp32 --disable-expensive-checks --disable-max-stride-tiling --hbm-scratchpad-page-size-in-bytes=536870912 --enable-replication --max-local-tensor-tile-size-in-bytes=32768 --tensor-layout-p-order=0 --tensor-layout-b-order=1 --enable-advanced-delinearization --weight-coalescing-threshold=512 --enable-bir-converter=enable --enable-tritium-loopfusion --enable-softmax-kernel --model-type-transformer --enable-isl-in-injective-check --enable-dge-on-io-dma --enable-dge-on-indirect-dma --enable-dge-on-vector-indirect-dma --keep-rng-tensor-op 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Running DoNothing 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Finished (changed=True) 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/DoNothing]: DoNothing finished after 0.000 seconds 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/LegalizeOpLevelAlias]: Running LegalizeOpLevelAlias 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/LegalizeOpLevelAlias]: Finished (changed=False) 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/LegalizeOpLevelAlias]: LegalizeOpLevelAlias finished after 0.003 seconds 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: Running OptimizeAliasedCopyChain 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: Finished (changed=False) 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/OptimizeAliasedCopyChain]: OptimizeAliasedCopyChain finished after 0.004 seconds 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=True) 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.032 seconds 2025-09-05T19:18:41Z INFO 1971 [sg0000/Tensorizer/TransformConvOp]: Running TransformConvOp 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TransformConvOp]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TransformConvOp]: TransformConvOp finished after 0.011 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LowerTensorOp]: Running LowerTensorOp 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LowerTensorOp]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LowerTensorOp]: LowerTensorOp finished after 0.004 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: Running AliasDependencyReset 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.033 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: AliasDependencyReset finished after 0.047 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpSimplifier]: Running TensorOpSimplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpSimplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpSimplifier]: TensorOpSimplifier finished after 0.014 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeIR]: Running CanonicalizeIR 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeIR]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeIR]: CanonicalizeIR finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LegalizeCCOpLayout]: Running LegalizeCCOpLayout 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LegalizeCCOpLayout]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LegalizeCCOpLayout]: LegalizeCCOpLayout finished after 0.004 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveComplicatePredicates]: Running ResolveComplicatePredicates 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveComplicatePredicates]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveComplicatePredicates]: ResolveComplicatePredicates finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AffinePredicateResolution]: Running AffinePredicateResolution 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AffinePredicateResolution]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AffinePredicateResolution]: AffinePredicateResolution finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Running EliminateDivs 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: EliminateDivs finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Running PerfectLoopNest 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: PerfectLoopNest finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.068 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Running TCTransform 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Running CommuteConcat 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: CommuteConcat finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ExpandBatchNorm]: Running ExpandBatchNorm 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ExpandBatchNorm]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ExpandBatchNorm]: ExpandBatchNorm finished after 0.006 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Running TCTransform 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Running EliminateDivs 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/EliminateDivs]: EliminateDivs finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpTransform]: Running TensorOpTransform 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpTransform]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TensorOpTransform]: TensorOpTransform finished after 0.060 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerTensorOp]: Running LateLowerTensorOp 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerTensorOp]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerTensorOp]: LateLowerTensorOp finished after 0.005 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: Running AliasDependencyReset 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.000 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Running AliasDependencyInduction 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyInduction]: AliasDependencyInduction finished after 0.005 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AliasDependencyReset]: AliasDependencyReset finished after 0.016 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MemcpyElimination]: Running MemcpyElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MemcpyElimination]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MemcpyElimination]: MemcpyElimination finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Rematerialization]: Running Rematerialization 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Rematerialization]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Rematerialization]: Rematerialization finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Running DeadStoreElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: DeadStoreElimination finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/SimplifySlice]: Running SimplifySlice 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/SimplifySlice]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/SimplifySlice]: SimplifySlice finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Running ValueNumbering 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: ValueNumbering finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PadElimination]: Running PadElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PadElimination]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PadElimination]: PadElimination finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Running LoopFusion 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopFusion]: LoopFusion finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Running ValueNumbering 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ValueNumbering]: ValueNumbering finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Running TCTransform 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/TCTransform]: TCTransform finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Running CommuteConcat 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CommuteConcat]: CommuteConcat finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/RecognizeOpIdiom]: Running RecognizeOpIdiom 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/RecognizeOpIdiom]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/RecognizeOpIdiom]: RecognizeOpIdiom finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Running MaskPropagation 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: MaskPropagation finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Running DeadStoreElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadStoreElimination]: DeadStoreElimination finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Recompute]: Running Recompute 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Recompute]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Recompute]: Recompute finished after 0.000 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Running DeadCodeElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: DeadCodeElimination finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [Tensorizer]: After optimization: 290 statements 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Running DoNothing 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DoNothing]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DoNothing]: DoNothing finished after 0.000 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MutateDataType]: Running MutateDataType 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MutateDataType]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MutateDataType]: MutateDataType finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AutoCastTCInputs]: Running AutoCastTCInputs 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AutoCastTCInputs]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AutoCastTCInputs]: AutoCastTCInputs finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Running GenericAccessSimplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/GenericAccessSimplifier]: GenericAccessSimplifier finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Running Simplifier 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Simplifier]: Simplifier finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Running DeadCodeElimination 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DeadCodeElimination]: DeadCodeElimination finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerReshapeOp]: Running LateLowerReshapeOp 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerReshapeOp]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LateLowerReshapeOp]: LateLowerReshapeOp finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferIntrinsicOnCC]: Running InferIntrinsicOnCC 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferIntrinsicOnCC]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferIntrinsicOnCC]: InferIntrinsicOnCC finished after 0.007 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveAccessConflict]: Running ResolveAccessConflict 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveAccessConflict]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ResolveAccessConflict]: ResolveAccessConflict finished after 0.004 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LocalLayoutOpt]: Running LocalLayoutOpt 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LocalLayoutOpt]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LocalLayoutOpt]: LocalLayoutOpt finished after 0.038 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Running DelinearIndices 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DelinearIndices]: DelinearIndices finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PGLayoutTilingPipeline]: Running PGLayoutTilingPipeline 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessingAndAnalysis]: Running LayoutPreprocessingAndAnalysis 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessing]: Running LayoutPreprocessing 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Running Delinearization 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/Delinearization]: Delinearization finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessing]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessing]: LayoutPreprocessing finished after 0.019 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutRequirementAnalysis]: Running LayoutRequirementAnalysis 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutRequirementAnalysis]: LayoutRequirementAnalysis finished after 0.005 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutPreprocessingAndAnalysis]: LayoutPreprocessingAndAnalysis finished after 0.035 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: Running InferNonlocalTensors 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: prefer_non_broadcast_par: True 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: prefer_non_broadcast_par: True 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InferNonlocalTensors]: InferNonlocalTensors finished after 0.017 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PAGLayoutOpt]: Running PAGLayoutOpt 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ParAxesAnnotation]: Running ParAxesAnnotation 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LayoutSearchAlgorithm]: prefer_non_broadcast_par: True 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ParAxesAnnotation]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/ParAxesAnnotation]: ParAxesAnnotation finished after 0.016 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertLocalTransposes]: Running InsertLocalTransposes 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertLocalTransposes]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertLocalTransposes]: InsertLocalTransposes finished after 0.004 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PAGLayoutOpt]: PAGLayoutOpt finished after 0.031 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Running MaskPropagation 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MaskPropagation]: MaskPropagation finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: Running CanonicalizeDAGForPGTiling 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/CanonicalizeDAGForPGTiling]: CanonicalizeDAGForPGTiling finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PGTiling]: Running PGTiling 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AGOrderingAnalysisPass]: Running AGOrderingAnalysisPass 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/AGOrderingAnalysisPass]: AGOrderingAnalysisPass finished after 0.048 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/StaticTransposeLocalTensor]: Running StaticTransposeLocalTensor 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/StaticTransposeLocalTensor]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/StaticTransposeLocalTensor]: StaticTransposeLocalTensor finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PComputeCutting]: Running PComputeCutting 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PComputeCutting]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PComputeCutting]: PComputeCutting finished after 0.007 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/BFComputeCutting]: Running BFComputeCutting 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/BFComputeCutting]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/BFComputeCutting]: BFComputeCutting finished after 0.003 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopSplitting]: Running LoopSplitting 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopSplitting]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/LoopSplitting]: LoopSplitting finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MacroGeneration]: Running MacroGeneration 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MacroGeneration]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/MacroGeneration]: MacroGeneration finished after 0.019 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/PGTiling]: PGTiling finished after 0.114 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertIOTransposes]: Running InsertIOTransposes 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertIOTransposes]: Finished (changed=True) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertIOTransposes]: InsertIOTransposes finished after 0.002 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertOffloadedTransposes]: Running InsertOffloadedTransposes 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertOffloadedTransposes]: Finished (changed=False) 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/InsertOffloadedTransposes]: InsertOffloadedTransposes finished after 0.001 seconds 2025-09-05T19:18:42Z INFO 1971 [sg0000/Tensorizer/DramToDramTranspose]: Running DramToDramTranspose 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/DramToDramTranspose]: Finished (changed=True) 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/DramToDramTranspose]: DramToDramTranspose finished after 14.818 seconds 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/PGLayoutTilingPipeline]: PGLayoutTilingPipeline finished after 15.072 seconds 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingProfiler]: Running TilingProfiler 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 20 MACROS WITH LARGEST INSTRUCTION COUNTS: 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 4096: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingProfiler]: Finished (changed=False) 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/TilingProfiler]: TilingProfiler finished after 0.174 seconds 2025-09-05T19:18:57Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=True) 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.101 seconds 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/InferNeuronTensor]: Running InferNeuronTensor 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/InferNeuronTensor]: Finished (changed=True) 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/InferNeuronTensor]: InferNeuronTensor finished after 0.654 seconds 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.101 seconds 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.030 seconds 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/RewriteReplicationMatmul]: Running RewriteReplicationMatmul 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/RewriteReplicationMatmul]: Finished (changed=False) 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/RewriteReplicationMatmul]: RewriteReplicationMatmul finished after 0.023 seconds 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop 2025-09-05T19:18:58Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.073 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Running SimplifyMacroPredicates 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: SimplifyMacroPredicates finished after 0.084 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DataLocalityOpt]: Running DataLocalityOpt 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DataLocalityOpt]: Finished (changed=True) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DataLocalityOpt]: DataLocalityOpt finished after 0.148 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DMATilingProfiler]: Running DMATilingProfiler 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 20 MACROS WITH LARGEST INSTRUCTION COUNTS: 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 4096: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PostDLOTilingBottleneck]: 1792: transpose_128x128 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DMATilingProfiler]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/DMATilingProfiler]: DMATilingProfiler finished after 0.029 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.107 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaMacro]: Running LegalizeSundaMacro 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaMacro]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaMacro]: LegalizeSundaMacro finished after 0.053 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.106 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Running PerfectLoopNest 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/PerfectLoopNest]: PerfectLoopNest finished after 0.022 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=True) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.080 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/RewriteWeights]: Running RewriteWeights 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/RewriteWeights]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/RewriteWeights]: RewriteWeights finished after 0.018 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/ReshapeWeights]: Running ReshapeWeights 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/ReshapeWeights]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/ReshapeWeights]: ReshapeWeights finished after 0.006 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Running FlattenMacroLoop 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/FlattenMacroLoop]: FlattenMacroLoop finished after 0.066 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Running SimplifyMacroPredicates 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: Finished (changed=False) 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/SimplifyMacroPredicates]: SimplifyMacroPredicates finished after 0.087 seconds 2025-09-05T19:18:59Z INFO 1971 [sg0000/Tensorizer/InferInitValue]: Running InferInitValue 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/InferInitValue]: Finished (changed=True) 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/InferInitValue]: InferInitValue finished after 0.437 seconds 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Running NeuronSimplifier 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: Finished (changed=False) 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifier]: NeuronSimplifier finished after 0.109 seconds 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SimplifyTensor]: Running SimplifyTensor 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SimplifyTensor]: Finished (changed=False) 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SimplifyTensor]: SimplifyTensor finished after 0.068 seconds 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/LICM]: Running LICM 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/LICM]: Finished (changed=False) 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/LICM]: LICM finished after 0.031 seconds 2025-09-05T19:19:00Z INFO 1971 [sg0000/Tensorizer/SundaISel]: Running SundaISel 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/SundaISel]: Finished (changed=True) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/SundaISel]: SundaISel finished after 0.472 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyReset]: Running NeuronAliasDependencyReset 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Running AliasDependencyElimination 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: Finished (changed=False) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/AliasDependencyElimination]: AliasDependencyElimination finished after 0.000 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: Running NeuronAliasDependencyInduction 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: Finished (changed=True) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyInduction]: NeuronAliasDependencyInduction finished after 0.039 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronAliasDependencyReset]: NeuronAliasDependencyReset finished after 0.051 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/LowerComplexBroadcast]: Running LowerComplexBroadcast 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/LowerComplexBroadcast]: Finished (changed=False) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/LowerComplexBroadcast]: LowerComplexBroadcast finished after 0.022 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Running NeuronLoopInterchange 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Finished (changed=False) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: NeuronLoopInterchange finished after 0.018 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.014 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopFusion]: Running NeuronLoopFusion 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopFusion]: Finished (changed=True) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopFusion]: NeuronLoopFusion finished after 0.076 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Running NeuronLoopInterchange 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: Finished (changed=False) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLoopInterchange]: NeuronLoopInterchange finished after 0.018 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.070 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/FactorizeBlkDims]: Running FactorizeBlkDims 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/FactorizeBlkDims]: Finished (changed=False) 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/FactorizeBlkDims]: FactorizeBlkDims finished after 0.092 seconds 2025-09-05T19:19:01Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Running NeuronInstComb 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Finished (changed=True) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: NeuronInstComb finished after 1.250 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronValueNumbering]: Running NeuronValueNumbering 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronValueNumbering]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronValueNumbering]: NeuronValueNumbering finished after 0.035 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Running NeuronInstComb 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronInstComb]: NeuronInstComb finished after 0.016 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeDMA]: Running VectorizeDMA 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeDMA]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeDMA]: VectorizeDMA finished after 0.023 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.009 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/LegalizePartitionReduce]: Running LegalizePartitionReduce 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/LegalizePartitionReduce]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/LegalizePartitionReduce]: LegalizePartitionReduce finished after 0.008 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/DeConcat]: Running DeConcat 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/DeConcat]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/DeConcat]: DeConcat finished after 0.001 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: Running FactorizeThreadAxesInFreeDims 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/FactorizeThreadAxesInFreeDims]: FactorizeThreadAxesInFreeDims finished after 0.017 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialSimdFusion]: Running PartialSimdFusion 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialSimdFusion]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialSimdFusion]: PartialSimdFusion finished after 0.007 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/TritiumFusion]: Running TritiumFusion 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/TritiumFusion]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/TritiumFusion]: TritiumFusion finished after 0.008 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Running CCOpFusion 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: CCOpFusion finished after 0.069 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeMatMult]: Running VectorizeMatMult 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeMatMult]: Finished (changed=False) 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/VectorizeMatMult]: VectorizeMatMult finished after 0.004 seconds 2025-09-05T19:19:02Z INFO 1971 [sg0000/Tensorizer/PartialLoopFusion]: Running PartialLoopFusion 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/PartialLoopFusion]: Finished (changed=False) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/PartialLoopFusion]: PartialLoopFusion finished after 0.122 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.038 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerTranspose]: Running LowerTranspose 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerTranspose]: Finished (changed=True) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerTranspose]: LowerTranspose finished after 0.415 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerBroadcast]: Running LowerBroadcast 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerBroadcast]: Finished (changed=False) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerBroadcast]: LowerBroadcast finished after 0.016 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LateNeuronInstComb]: Running LateNeuronInstComb 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LateNeuronInstComb]: Finished (changed=True) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LateNeuronInstComb]: LateNeuronInstComb finished after 0.107 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SplitAccGrp]: Running SplitAccGrp 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SplitAccGrp]: Finished (changed=False) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SplitAccGrp]: SplitAccGrp finished after 0.013 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SpillPSum]: Running SpillPSum 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SpillPSum]: Finished (changed=False) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/SpillPSum]: SpillPSum finished after 0.120 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerIntrinsics]: Running LowerIntrinsics 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerIntrinsics]: Finished (changed=False) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LowerIntrinsics]: LowerIntrinsics finished after 0.015 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/InlineNativeKernels]: Running InlineNativeKernels 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/InlineNativeKernels]: Finished (changed=False) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/InlineNativeKernels]: InlineNativeKernels finished after 0.012 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LegalizeType]: Running LegalizeType 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LegalizeType]: Finished (changed=True) 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/LegalizeType]: LegalizeType finished after 0.089 seconds 2025-09-05T19:19:03Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Running NeuronLICM 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronLICM]: NeuronLICM finished after 0.063 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/InferPSumTensor]: Running InferPSumTensor 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/InferPSumTensor]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/InferPSumTensor]: InferPSumTensor finished after 0.137 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/WeightCoalescing]: Running WeightCoalescing 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/WeightCoalescing]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/WeightCoalescing]: WeightCoalescing finished after 0.013 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaAccess]: Running LegalizeSundaAccess 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaAccess]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/LegalizeSundaAccess]: LegalizeSundaAccess finished after 0.114 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/RelaxPredicates]: Running RelaxPredicates 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/RelaxPredicates]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/RelaxPredicates]: RelaxPredicates finished after 0.029 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/TensorInitialization]: Running TensorInitialization 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/TensorInitialization]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/TensorInitialization]: TensorInitialization finished after 0.015 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Running NeuronSimplifyPredicates 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/NeuronSimplifyPredicates]: NeuronSimplifyPredicates finished after 0.014 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/ExpandISAMacro]: Running ExpandISAMacro 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/ExpandISAMacro]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/ExpandISAMacro]: ExpandISAMacro finished after 0.025 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SimplifyNeuronTensor]: Running SimplifyNeuronTensor 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SimplifyNeuronTensor]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SimplifyNeuronTensor]: SimplifyNeuronTensor finished after 0.050 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DMALocalityOpt]: Running DMALocalityOpt 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DMALocalityOpt]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DMALocalityOpt]: DMALocalityOpt finished after 0.010 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DataStreaming]: Running DataStreaming 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DataStreaming]: Finished (changed=False) 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/DataStreaming]: DataStreaming finished after 0.026 seconds 2025-09-05T19:19:04Z INFO 1971 [sg0000/Tensorizer/SFKVectorizer]: Running SFKVectorizer 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SFKVectorizer]: Finished (changed=True) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SFKVectorizer]: SFKVectorizer finished after 2.728 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizeInst]: Running LateLegalizeInst 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizeInst]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizeInst]: LateLegalizeInst finished after 0.050 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CoalesceCCOp]: Running CoalesceCCOp 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CoalesceCCOp]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CoalesceCCOp]: CoalesceCCOp finished after 0.015 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SimpleAllReduceTiling]: Running SimpleAllReduceTiling 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SimpleAllReduceTiling]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SimpleAllReduceTiling]: SimpleAllReduceTiling finished after 0.015 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Running DMAProfiler 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Top 10 (estimated) latency DMAs: 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 583.920us (128.000MiB, est bw: 229.856GB/s, 1.157% of tot. time) for bfloat16<128 x 4096> TongaSB partitions[1] bfloat16 (128, 128, 4096) %'19629.24343'[T_i0,i0.128,i1.4096] = load bfloat16<128 x 4096> {'CrossPassTensor': ''}bfloat16 (128, 128, 4096) %'input289'[T_i0,i0.128,i1.4096] # id=23881, src_id=None, , instances=128 # dl = tensor_op_name: t4051_pftranspose_19629 | hlo_id: 2097 | [[i0.128];[i1.4096]] -> [[i0.128];[i1.4096]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 349.240us (128.000MiB, est bw: 384.314GB/s, 0.692% of tot. time) for bfloat16<128 x 4096> bfloat16 (128, 128, 32, 128) %'output289'[16i0_18312_0+i0_18312_1,i0.128,i2.32,i1.128] = store bfloat16<128 x 4096> TongaSB partitions[1] bfloat16 (128, 128, 4096) %'t4051_pftranspose_19629'[16i0_18312_0+i0_18312_1,i0.128,i1.128+128i2.32] # id=21501, src_id=None, , instances=128 # dl = tensor_op_name: _transpose.578 | hlo_id: 2097 | [[i0.128];[i1.128, i2.32]] -> [[i0.128];[i1.128, i2.32]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18341.23903'[T_i0,T_i2_26087,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input6'[T_i0,i0.128,T_i2_26087,i1.3584] # id=22057, src_id=None, , instances=64 # dl = tensor_op_name: t2070_pftranspose_18341 | hlo_id: 1531 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18382.23917'[T_i0,T_i2_26095,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input15'[T_i0,i0.128,T_i2_26095,i1.3584] # id=22115, src_id=None, , instances=64 # dl = tensor_op_name: t2133_pftranspose_18382 | hlo_id: 1549 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18423.23931'[T_i0,T_i2_26103,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input24'[T_i0,i0.128,T_i2_26103,i1.3584] # id=22173, src_id=None, , instances=64 # dl = tensor_op_name: t2196_pftranspose_18423 | hlo_id: 1567 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18464.23945'[T_i0,T_i2_26111,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input33'[T_i0,i0.128,T_i2_26111,i1.3584] # id=22231, src_id=None, , instances=64 # dl = tensor_op_name: t2259_pftranspose_18464 | hlo_id: 1585 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18505.23959'[T_i0,T_i2_26119,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input42'[T_i0,i0.128,T_i2_26119,i1.3584] # id=22289, src_id=None, , instances=64 # dl = tensor_op_name: t2322_pftranspose_18505 | hlo_id: 1603 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18546.23973'[T_i0,T_i2_26127,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input51'[T_i0,i0.128,T_i2_26127,i1.3584] # id=22347, src_id=None, , instances=64 # dl = tensor_op_name: t2385_pftranspose_18546 | hlo_id: 1621 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18587.23987'[T_i0,T_i2_26135,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input60'[T_i0,i0.128,T_i2_26135,i1.3584] # id=22405, src_id=None, , instances=64 # dl = tensor_op_name: t2448_pftranspose_18587 | hlo_id: 1639 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Est. DMA time: 262.710us (56.000MiB, est bw: 223.518GB/s, 0.521% of tot. time) for bfloat16<128 x 3584> TongaSB partitions[2] bfloat16 (32, 2, 128, 3584) %'18628.24001'[T_i0,T_i2_26143,i0.128,i1.3584] = load bfloat16<128 x 3584> {'CrossPassTensor': ''}bfloat16 (32, 128, 2, 3584) %'input69'[T_i0,i0.128,T_i2_26143,i1.3584] # id=22463, src_id=None, , instances=64 # dl = tensor_op_name: t2511_pftranspose_18628 | hlo_id: 1657 | [[i0.128];[i1.3584]] -> [[i0.128];[i1.3584]] 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DMAProfiler]: DMAProfiler finished after 0.029 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/OptimizeNKIKernels]: Running OptimizeNKIKernels 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/OptimizeNKIKernels]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/OptimizeNKIKernels]: OptimizeNKIKernels finished after 0.015 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Running CCOpFusion 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: Finished (changed=True) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/CCOpFusion]: CCOpFusion finished after 0.304 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/StaticProfiler]: Running StaticProfiler 2025-09-05T19:19:07Z WARNING 1971 [sg0000/Tensorizer/StaticProfiler]: matmul-based transposes inserted by penguin takes up 100.00 percent of all matmul computation 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/StaticProfiler]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/StaticProfiler]: StaticProfiler finished after 0.035 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SplitAPUnionSets]: Running SplitAPUnionSets 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SplitAPUnionSets]: Finished (changed=True) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/SplitAPUnionSets]: SplitAPUnionSets finished after 0.130 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizePostSplit]: Running LateLegalizePostSplit 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizePostSplit]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/LateLegalizePostSplit]: LateLegalizePostSplit finished after 0.032 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DumpGraphAndMetadata]: Running DumpGraphAndMetadata 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DumpGraphAndMetadata]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/DumpGraphAndMetadata]: DumpGraphAndMetadata finished after 0.038 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/ZeroSizeTensorElimination]: Running ZeroSizeTensorElimination 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/ZeroSizeTensorElimination]: Finished (changed=False) 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/ZeroSizeTensorElimination]: ZeroSizeTensorElimination finished after 0.000 seconds 2025-09-05T19:19:07Z INFO 1971 [sg0000/Tensorizer/BirCodeGenLoop]: Running BirCodeGenLoop 2025-09-05T19:19:08Z INFO 1971 [sg0000/Tensorizer/BirCodeGenLoop]: Finished (changed=False) 2025-09-05T19:19:08Z INFO 1971 [sg0000/Tensorizer/BirCodeGenLoop]: BirCodeGenLoop finished after 0.618 seconds 2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: BirCodeGen estimate #instances=285991 in sg0000 2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: IR signature: 868d010c2395728dde8cb4de874491d6109b3553bfa434107fadd7604cffd131 for nc00/sg0000/TensorizerBIR 2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: Weights total number of bytes: 131072 2025-09-05T19:19:09Z INFO 1971 [Tensorizer]: Successfully built model. 2025-09-05T19:19:09Z USER 1971 [root/Tensorizer/Tensorizer]: Tensorizer finished after 27.551 seconds 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: End tensorization 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input0 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input1 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input2 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input3 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input4 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input5 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input6 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input7 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input8 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input9 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input10 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input11 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input12 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input13 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input14 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input15 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input16 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input17 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input18 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input19 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input20 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input21 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input22 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input23 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input24 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input25 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input26 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input27 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input28 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input29 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input30 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input31 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input32 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input33 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input34 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input35 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input36 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input37 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input38 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input39 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input40 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input41 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input42 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input43 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input44 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input45 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input46 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input47 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input48 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input49 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input50 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input51 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input52 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input53 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input54 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input55 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input56 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input57 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input58 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input59 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input60 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input61 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input62 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input63 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input64 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input65 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input66 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input67 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input68 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input69 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input70 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input71 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input72 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input73 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input74 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input75 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input76 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input77 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input78 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input79 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input80 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input81 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input82 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input83 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input84 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input85 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input86 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input87 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input88 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input89 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input90 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input91 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input92 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input93 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input94 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input95 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input96 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input97 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input98 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input99 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input100 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input101 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input102 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input103 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input104 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input105 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input106 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input107 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input108 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input109 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input110 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input111 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input112 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input113 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input114 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input115 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input116 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input117 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input118 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input119 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input120 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input121 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input122 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input123 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input124 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input125 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input126 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input127 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input128 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input129 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input130 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input131 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input132 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input133 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input134 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input135 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input136 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input137 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input138 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input139 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input140 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input141 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input142 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input143 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input144 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input145 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input146 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input147 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input148 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input149 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input150 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input151 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input152 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input153 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input154 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input155 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input156 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input157 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input158 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input159 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input160 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input161 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input162 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input163 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input164 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input165 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input166 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input167 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input168 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input169 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input170 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input171 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input172 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input173 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input174 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input175 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input176 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input177 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input178 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input179 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input180 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input181 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input182 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input183 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input184 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input185 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input186 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input187 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input188 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input189 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input190 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input191 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input192 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input193 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input194 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input195 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input196 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input197 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input198 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input199 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input200 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input201 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input202 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input203 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input204 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input205 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input206 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input207 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input208 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input209 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input210 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input211 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input212 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input213 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input214 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input215 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input216 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input217 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input218 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input219 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input220 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input221 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input222 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input223 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input224 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input225 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input226 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input227 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input228 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input229 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input230 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input231 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input232 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input233 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input234 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input235 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input236 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input237 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input238 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input239 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input240 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input241 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input242 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input243 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input244 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input245 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input246 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input247 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input248 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input249 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input250 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input251 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input252 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input253 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input254 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input255 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input256 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input257 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input258 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input259 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input260 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input261 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input262 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input263 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input264 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input265 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input266 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input267 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input268 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input269 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input270 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input271 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input272 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input273 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input274 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input275 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input276 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input277 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input278 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input279 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input280 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input281 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input282 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input283 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input284 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input285 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input286 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input287 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input288 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input289 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Network input: input290 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: wrote bir.json 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: wrote tensor_map.json 2025-09-05T19:19:09Z INFO 1971 [job.Frontend.0]: Job #0 finished 2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.Frontend.0 2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.StaticIOTranspose.0 2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.StaticIOTranspose.0 2025-09-05T19:19:09Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.WalrusDriver.0 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: BackendDriver has 1 states with 1 core LNC 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: BackendDriver: no partitions found. Switching to flat flow. 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Job WalrusDriver len(in_states) 1 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Processing input #0 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: BackendDriver in_state.num_states 1 with 1 core LNC 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Executing /opt/conda/lib/python3.10/site-packages/neuronxcc/starfish/bin/walrus_driver --optlevel 2 --allocator coloring --verbose 35 --logfile-verbose 20 --logfile /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/log-neuron-cc.txt --execute-repetition 1 -i bir.json --min_split_size 10240 --skip_split_vns '' --no_split_dram --split_huge_dram_tensor 1.0 --preprocessing_only --max_tensorizer_distance 64 --pack_same_shape_only --instruction_fetch_latency 511 --max-partitions 1 --policy 3 --auxflag 0 --interleave none --schedule-delayed-latency 1 --postsched-mm-accum-reorder=false --max-load-color-rotation --max-load-lower-bound 0.14 --mm-reorder-opt --force-prefetch-follow-incoming-order -1 --allreduce-buffer-size 500 --dram-page-size 512 --dram-rotation-size -1 --allreduce-rotation-dis 8 --repeat-load-thres 4 --enable-mm-transpose-remat-optimization=true --save-len-thres 512 --save-dma-cnt-thres 32 --relaxed-order=true --enable-anti-dependence-reduction=false --num-semaphores-per-queue 16 --numcores 1 --act-root-json /opt/conda/lib/python3.10/site-packages/neuronxcc/pwp/pwp_bin_trainium/act_info.json --dve-root-json /opt/conda/lib/python3.10/site-packages/neuronxcc/dve/dve_bin_gen2/dve_info.json --unified-backend-and-legacy-codegen --tensor-map tensor_map.json --enable-verifier=true --enable-birsim=false --enable-birsim-sync-only=false --enable-data-race-checker=false --enable-new-backend=true --inject-error=NONE --dge-levels vector_dynamic_offsets,scalar_dynamic_offset,io --dynamic-dma-scratch-size-per-partition=16384 --neff-output-filename /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: Working directory is /opt/vllm/neuronxcc-gss0wi_w/sg00 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: propagate_exit=True 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: use_logger=False 2025-09-05T19:19:09Z INFO 1971 [job.WalrusDriver.0]: expose_stderr=True 2025-09-05T19:19:09Z INFO 2007 [Logging]: Logging to ../../../../models/mistral-7b-v0.3-instruct-neuronx/layout_opt/log-neuron-cc.txt at level 'INFO' 2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: max_allowed_parallelism=32 2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: Backend driver mtBackend: false numModules: 1 Cwd: "/opt/vllm/neuronxcc-gss0wi_w/sg00" 2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: DynamicDMA is enabled 2025-09-05T19:19:09Z INFO 2007 [BackendDriver]: DynamicDMA levels being enabled: io, scalar_dynamic_offset, vector_dynamic_offsets, 2025-09-05T19:19:09Z USER 2007 [BackendPassManager]: Running mod_parallel_pass 2025-09-05T19:19:09Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:09Z USER 2007 [ModuleForkPass]: Running do_nothing 2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: Inputs to do_nothing: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:09Z USER 2007 [ModuleForkPass]: do_nothing finished after 0.002 seconds 2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: curr_vmrss: 175mb, ru_maxrss: 419mb (delta=0mb) 2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 2025-09-05T19:19:09Z USER 2007 [ModuleForkPass]: Running birverifier 2025-09-05T19:19:09Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: birverifier finished after 0.275 seconds 2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=542mb) 2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.282 seconds 2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=542mb) 2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass 2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [SubgraphForkPass]: Running lnc_verifier 2025-09-05T19:19:10Z INFO 2007 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [SubgraphForkPass]: lnc_verifier finished after 0.000 seconds 2025-09-05T19:19:10Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=0mb) 2025-09-05T19:19:10Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.001 seconds 2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=0mb) 2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [BackendPassManager]: Running mod_parallel_pass 2025-09-05T19:19:10Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: Running expand_replication 2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Inputs to expand_replication: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z INFO 2007 [ExpandReplication]: Found 0 replicated matmults 2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: expand_replication finished after 0.001 seconds 2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: curr_vmrss: 961mb, ru_maxrss: 961mb (delta=0mb) 2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 1455 memory location(s), 1 block(s), and 775 instruction(s). Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z USER 2007 [ModuleForkPass]: Running unroll 2025-09-05T19:19:10Z INFO 2007 [ModuleForkPass]: Inputs to unroll: modules=1 functions=1 allocs=1455 blocks=1 instructions=775 Max writers: 1 Max Readers: 290 2025-09-05T19:19:10Z INFO 2007 [Unroll]: INFO (Unroll) Start unrolling at Fri Sep 5 19:19:10 2025 2025-09-05T19:19:12Z INFO 2007 [Unroll]: INFO (Unroll) DONE unrolling Fri Sep 5 19:19:10 2025 2025-09-05T19:19:12Z INFO 2007 [Unroll]: sg0000 Instruction count after Unroll: 2025-09-05T19:19:12Z INFO 2007 [Unroll]: Total count: 285701 2025-09-05T19:19:12Z INFO 2007 [Unroll]: Matmult: 217153 2025-09-05T19:19:12Z INFO 2007 [Unroll]: GenericCopy: 54337 2025-09-05T19:19:12Z INFO 2007 [Unroll]: Load: 7106 2025-09-05T19:19:12Z INFO 2007 [Unroll]: Save: 7105 2025-09-05T19:19:12Z INFO 2007 [Unroll]: Unrolled DGE count with Dynamic AP: 0 2025-09-05T19:19:12Z USER 2007 [ModuleForkPass]: unroll finished after 2.471 seconds 2025-09-05T19:19:12Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2540mb, ru_maxrss: 2541mb (delta=1580mb) 2025-09-05T19:19:12Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69807 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 2.543 seconds 2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: curr_vmrss: 1679mb, ru_maxrss: 2541mb (delta=1580mb) 2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69807 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass 2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69807 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:12Z USER 2007 [SubgraphForkPass]: Running dead_code_elim 2025-09-05T19:19:12Z INFO 2007 [SubgraphForkPass]: Inputs to dead_code_elim: modules=1 functions=1 allocs=69807 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:12Z INFO 2007 [DeadCodeElim]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-09-05T19:19:12Z USER 2007 [SubgraphForkPass]: dead_code_elim finished after 0.257 seconds 2025-09-05T19:19:12Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 1699mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:12Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.266 seconds 2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: curr_vmrss: 1699mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:12Z USER 2007 [BackendPassManager]: Running mod_parallel_pass 2025-09-05T19:19:12Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:12Z USER 2007 [ModuleForkPass]: Running birverifier 2025-09-05T19:19:12Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: birverifier finished after 0.232 seconds 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.239 seconds 2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass 2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [SubgraphForkPass]: Running lnc_verifier 2025-09-05T19:19:13Z INFO 2007 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [SubgraphForkPass]: lnc_verifier finished after 0.003 seconds 2025-09-05T19:19:13Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.010 seconds 2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [BackendPassManager]: Running mod_parallel_pass 2025-09-05T19:19:13Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running instruction_reorder 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to instruction_reorder: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: instruction_reorder finished after 0.027 seconds 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running psum_legalization 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to psum_legalization: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: psum_legalization finished after 0.031 seconds 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running legalize_cce_dma 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to legalize_cce_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: legalize_cce_dma finished after 0.014 seconds 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running error_injector 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to error_injector: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z WARNING 2007 [ErrorInjector]: Unrecognized injected error value "0" 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: error_injector finished after 0.003 seconds 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1753mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running vn_splitter 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to vn_splitter: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z INFO 2007 [VNSplitter]: INFO (VNSplitter) Collected all the internal vnodes: size = 0 2025-09-05T19:19:13Z INFO 2007 [VNSplitter]: INFO (VNSplitter) Done with analyze and splitting: total dead nodes = 0 2025-09-05T19:19:13Z INFO 2007 [PerformanceProfiler]: number of tensorizer non-local-tensor caused reload left 0 2025-09-05T19:19:13Z INFO 2007 [PerformanceProfiler]: number of tensorizer non-local-tensor caused spill left 0 2025-09-05T19:19:13Z INFO 2007 [VNSplitterPass]: INFO (VNSplitter) Time: 0.004 seconds 2025-09-05T19:19:13Z INFO 2007 [VNSplitterPass]: INFO (VerticalFusion) Time: 0.073 seconds 2025-09-05T19:19:13Z INFO 2007 [VNSplitterPass]: INFO (ShrinkDN) Time: 0.075 seconds 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: vn_splitter finished after 0.186 seconds 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1764mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running constant_propagate 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to constant_propagate: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: [Constant_propagate for select] directly remove instruction number: 0 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: [Constant_propagate for Affineselect] directly remove instruction number: 0 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:13Z INFO 2007 [ConstantPropagate]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: constant_propagate finished after 0.482 seconds 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1768mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:13Z USER 2007 [ModuleForkPass]: Running lower_ac 2025-09-05T19:19:13Z INFO 2007 [ModuleForkPass]: Inputs to lower_ac: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z INFO 2007 [LowerAC]: INFO (LowerAC) Lowered 0 loads, 0 saves, 0 copies. 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: lower_ac finished after 0.026 seconds 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1768mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running input_dma_coalescing 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to input_dma_coalescing: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z INFO 2007 [DMAOptimizationBase]: DMA input Coalescing combined 0 input loads 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: input_dma_coalescing finished after 0.078 seconds 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1768mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running remat_optimization 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to remat_optimization: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z INFO 2007 [RematOpt]: Removed 0 remat instructions 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: remat_optimization finished after 0.174 seconds 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running early_peephole_opts 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to early_peephole_opts: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z INFO 2007 [EarlyPeepholeOpts]: PeepholeOpts enabled? ActivationAccumulate: true 2025-09-05T19:19:14Z INFO 2007 [EarlyPeepholeOpts]: Activation Accumulate: 0 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: early_peephole_opts finished after 0.048 seconds 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running coalesce_multichannel_cc_ops 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to coalesce_multichannel_cc_ops: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: coalesce_multichannel_cc_ops finished after 0.015 seconds 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running infer_stream_ids 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to infer_stream_ids: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: infer_stream_ids finished after 0.015 seconds 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1772mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z USER 2007 [ModuleForkPass]: Running pre_sched 2025-09-05T19:19:14Z INFO 2007 [ModuleForkPass]: Inputs to pre_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start PRE scheduling 2 cores: 1 at: Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: Start... 2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: Found 0 Splits CCs 2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: Grouped CCs to 0 clusters. 2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: To Spill 0 multi-layer tensors 2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: set uninit flag on 0 insts 2025-09-05T19:19:14Z INFO 2007 [LayerSpiller]: LayerSpill: Done. 2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start split live ranges Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [PreSched]: Num_Splits: 0 2025-09-05T19:19:14Z INFO 2007 [PreSched]: End split live ranges Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [PreSched]: Strt remove redundncies Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_memsets 2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_memsets: 0 2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_loads 2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_loads: 0 2025-09-05T19:19:14Z INFO 2007 [PreSched]: End remove redundncies Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start DCE Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [PreSched]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:14Z INFO 2007 [PreSched]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-09-05T19:19:14Z INFO 2007 [PreSched]: End DCE Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [PreSched]: Start build flow dependencies Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [build_flow_deps]: Start build fdeps. Invocation: 1Fri Sep 5 19:19:14 2025 2025-09-05T19:19:14Z INFO 2007 [build_flow_deps]: Allocs: 69132 instructions: 285701 2025-09-05T19:19:15Z INFO 2007 [build_flow_deps]: Build fdeps inserted 714949 edges 2025-09-05T19:19:15Z INFO 2007 [build_flow_deps]: Done build fdeps 714949 Fri Sep 5 19:19:15 2025 2025-09-05T19:19:15Z INFO 2007 [PreSched]: End build flow dependencies Fri Sep 5 19:19:15 2025 2025-09-05T19:19:15Z INFO 2007 [PreSched]: Start remove useless insts Fri Sep 5 19:19:15 2025 2025-09-05T19:19:15Z INFO 2007 [PreSched]: remove_useless_insts 2025-09-05T19:19:16Z INFO 2007 [PreSched]: remove Useless Instructions: 0 2025-09-05T19:19:16Z INFO 2007 [PreSched]: End remove useless insts Fri Sep 5 19:19:16 2025 2025-09-05T19:19:16Z INFO 2007 [PreSched]: Start scratchpad optimization Fri Sep 5 19:19:16 2025 2025-09-05T19:19:16Z INFO 2007 [PreSched]: End scratchpad optimization Fri Sep 5 19:19:16 2025 2025-09-05T19:19:16Z INFO 2007 [PreSched]: DONE PRE scheduling Fri Sep 5 19:19:16 2025 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: pre_sched finished after 1.914 seconds 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running tensor_copy_elim 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to tensor_copy_elim: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: Tensor CP elimination: 0 2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: remove_must_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: remove_redundant_alias_dmacopy removed 0 DMAcopys 2025-09-05T19:19:16Z INFO 2007 [TensorCopyElim]: remove_redundant_internal2internal_dmacopy removed 0 DMAcopys 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: tensor_copy_elim finished after 0.341 seconds 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running dynamic_dma_setup 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to dynamic_dma_setup: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: dynamic_dma_setup finished after 0.004 seconds 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running runtime_memory_reservation 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to runtime_memory_reservation: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: runtime_memory_reservation finished after 0.004 seconds 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1884mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z USER 2007 [ModuleForkPass]: Running coloring_allocator_psum 2025-09-05T19:19:16Z INFO 2007 [ModuleForkPass]: Inputs to coloring_allocator_psum: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:16Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions 2025-09-05T19:19:16Z INFO 2007 [ColoringAllocator::Rep]: linearize and check 2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: allocating PSUM 2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: main loop 2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: renumber locations 2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: size = 54337 2025-09-05T19:19:16Z INFO 2007 [PSUM_Allocator]: build_no_bitmap start 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: 100% PSUM demand before spilling 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: PSUM high-water mark = 8 tensors 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: found 182784 edges 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: mean: 6.72779 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: median: 6.99997 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: adjacency vectors require 1462272 bytes 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: build_no_bitmap done 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: find costs 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: best-of-n loop, heuristic = 0, allow_psum_spill_within_accum_group = false 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: simplify interference graph 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: initialize low and high 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: lo = 54337 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: hi = 0 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: inf = 0 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: total = 54337 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: simplify 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: new candidates = 0 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: select ranges 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: no more spills 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: PSUM score = 0 (lower is better) 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: spilling from PSUM cost about 0 cycles 2025-09-05T19:19:17Z INFO 2007 [PSUM_Allocator]: 100% PSUM utilization after allocation 2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: coloring_allocator_psum finished after 0.597 seconds 2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1901mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: Running dma_optimization_psum 2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Inputs to dma_optimization_psum: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:17Z INFO 2007 [DMAOptimizationBase]: [psum spill optimization]: removed 0 spill/reload instructions 2025-09-05T19:19:17Z INFO 2007 [DMAOptimizationBase]: [psum spill optimization]: removed 0 spill/reload memory locations 2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: dma_optimization_psum finished after 0.203 seconds 2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1901mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:17Z USER 2007 [ModuleForkPass]: Running address_rotation_psum 2025-09-05T19:19:17Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_psum: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:18Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-09-05T19:19:18Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-09-05T19:19:19Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-09-05T19:19:19Z USER 2007 [ModuleForkPass]: address_rotation_psum finished after 1.984 seconds 2025-09-05T19:19:19Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1907mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:19Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:19Z USER 2007 [ModuleForkPass]: Running coloring_allocator_sb 2025-09-05T19:19:19Z INFO 2007 [ModuleForkPass]: Inputs to coloring_allocator_sb: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA DRAM bytes loaded 7114104832 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA average loaded DMA size 7875 bytes 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA DRAM bytes saved 7114072064 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Pre GCA average saved DMA size 7822 bytes 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes DMACopyed 0 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average DMACopyed DMA size 0 bytes 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions 2025-09-05T19:19:19Z INFO 2007 [ColoringAllocator::Rep]: linearize and check 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: allocating SB 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: main loop 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: renumber locations 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: size = 14212 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: find partners 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: found 54337 accumulation groups 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: largest = 19629.23884_i1022 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: tensors = 2 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: requires 8448 bytes/partition 2025-09-05T19:19:19Z INFO 2007 [SB_Allocator]: expanding partners 2025-09-05T19:19:19Z INFO 2007 []: find first defs for local 2025-09-05T19:19:19Z INFO 2007 []: find first defs for global 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: find loads 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 1 pin count 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 6081 remat count 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 1 pinned tensors will require about 16384 bytes/partition 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: build interference graph 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: pass 1 int-tree 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Num intervals 14212 Num locations 14212 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: IntervalTree Build Done 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: info.neighbors init Done 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: info.neighbors partners Done 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: IntervalTree readback Done 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: edge: 31236 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: mean: 4.39572 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: median: 2.00087 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: find costs 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: best-of-n loop, heuristic = 0 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: simplify interference graph 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: initialize safe & unsafe 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: safe = 14210 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: unsafe = 1 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: inf = 0 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: total = 14211 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: simplify 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: simplify_step3_sorted2 #Unsafe 0 #Pinned 0 #Safe 0 minCost 1.79769e+308 maxCost 2.22507e-308 locations 14212 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: new candidates = 0 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: select ranges 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Total: 14211 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Spilled: 0.000 (0) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Allocated: 1.000 (14211) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Rover zone: 0.989 (14051) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Pre-rover zone: 0.009 (128) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Post-rover zone: 0.002 (32) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Slice zone: 0.000 (0) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Blocks nothing: 0.000 (0) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Blocks medium: 0.000 (0) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Blocks tall: 1.000 (14211) 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Visited until tall blocking (mean): 0.997 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Visited until tall blocking (median): 1.000 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Visited until tall blocking (p95): 1.000 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: Success 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: SB spills = 0 tensors 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: size = 0 bytes/partition 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: remats = 0 tensors 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: unpinned = 0 tensors 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: size = 0 bytes/partition 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: SB score = 0 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: spilling from SB cost about 0 cycles 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 16384 bytes/partition (100%) successfully pinned 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: pinning saved approximately 9010 cycles 2025-09-05T19:19:20Z INFO 2007 [SB_Allocator]: 0% SB utilization after allocation 2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes loaded 7114104832 2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average loaded DMA size 7875 bytes 2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes saved 7114072064 2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average saved DMA size 7822 bytes 2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA DRAM bytes DMACopyed 0 2025-09-05T19:19:20Z INFO 2007 [ColoringAllocator::Rep]: INFO: Post GCA average DMACopyed DMA size 0 bytes 2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: coloring_allocator_sb finished after 1.103 seconds 2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1911mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: Running address_rotation_sb 2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:20Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: address_rotation_sb finished after 0.290 seconds 2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1912mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69133 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:20Z USER 2007 [ModuleForkPass]: Running dma_optimization_sb 2025-09-05T19:19:20Z INFO 2007 [ModuleForkPass]: Inputs to dma_optimization_sb: modules=1 functions=1 allocs=69133 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:20Z INFO 2007 [DMAOptimizationBase]: DMA optimization In bytes loaded or saved 14228176896, 50.0001% input load, 49.9999% output write, 0% spill/reload [sg0000] 2025-09-05T19:19:20Z INFO 2007 [DMAOptimizationBase]: [DMA optimization]Reload_just_for_save Optimization removed 0 memlocs 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: removed 0 identical load 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: adjusted 0 DMACopy remat 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: adjusted 0 DMACopy remat 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: sub-graph will get execute 1 times 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Load Merging]: removed 0 remat/cloned instructions 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Load shrink]: shrinked 0 GCA remat/cloned instructions 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Load Merging + Load shrink] reduced input/const loading DMA traffic 0, 0% out of total dma traffic(7.1141e+09) 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload instructions 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload memory locations 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Spill Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Allocation optimization]: removed 0 spill/reload instructions 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Allocation optimization]: removed 0 spill/reload memory locations 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Re-allocation Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload instructions 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [spill optimization round 0]: removed 0 spill/reload memory locations 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [Spill Optimization] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [remove extra save] removed 0 memlocs and 0 instructions 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [remove_memset_spill]: removed 0 spill/reload instructions 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: [remove_memset_spill]: removed 0 spill/reload memory locations 2025-09-05T19:19:21Z INFO 2007 [DMAOptimizationBase]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: DMA SpillSave Coalescing Round 0 combined 0 SpillSaves and Reloads 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: average loaded DMA size 7875 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: average saved DMA size 7822 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing DRAM bytes loaded 7114104832 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing average loaded DMA size 7875 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing DRAM bytes saved 7114072064 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA coalescing average saved DMA size 7822 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: [DMA optimization]Reload_just_for_save Optimization removed 0 memlocs 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: [Experiment partial DMA access] reduced DMA traffic 0, -nan% out of total spill/reload dma traffic 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: [DMA optimization] reduced DMA traffic 0, 0% out of total dma traffic 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: DMA optimization Out bytes loaded or saved 14228176896, 50.0001% input load, 49.9999% output write, 0% spill/reload [sg0000] 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes loaded 7114104832 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average loaded DMA size 7875 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes saved 7114072064 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average saved DMA size 7822 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization DRAM bytes DMAcopyed 0 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average DMAcopyed DMA size 0 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Post DMA optimization average DMA size 7848 bytes 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: INFO: Finished set_spill_canreadUninit(module); 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: DMA optimization re-enable optimization 2025-09-05T19:19:22Z USER 2007 [ModuleForkPass]: dma_optimization_sb finished after 1.621 seconds 2025-09-05T19:19:22Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1932mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:22Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:22Z USER 2007 [ModuleForkPass]: Running address_rotation_sb 2025-09-05T19:19:22Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:22Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 5919 Sb address 2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 4699 Sb address 2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:23Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 1769 Sb address 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: address_rotation_sb finished after 1.665 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1932mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running coloring_allocator_dram 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to coloring_allocator_dram: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions 2025-09-05T19:19:24Z INFO 2007 [ColoringAllocator::Rep]: linearize and check 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: allocating spills in DRAM pre_link mode for address space Local 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: reserved space = 14496612352 bytes 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: spill space = 0 bytes 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: aligned spill space = 0 bytes 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: dram space = 107374182400 bytes 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: renumber locations 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: size = 0 2025-09-05T19:19:24Z INFO 2007 []: find first defs for local 2025-09-05T19:19:24Z INFO 2007 []: find first defs for global 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: Num intervals 0 Num locations 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: IntervalTree Build Done 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: info.neighbors init Done 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: IntervalTree readback Done 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: simplify interference graph 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: initialize low and high 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: lo = 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: hi = 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: total = 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: simplify 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: new candidates = 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: select ranges 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: CC buffer size limit 524288000 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: allreduce_dram_hwm 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: Real CC buffer size 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: DRAM hwm after allocation: 0 2025-09-05T19:19:24Z INFO 2007 [DRAM_Allocator]: DRAM allocation successful 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: coloring_allocator_dram finished after 0.469 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1937mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running address_rotation_dram 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_dram: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: Runtime page size at 512MB 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: DRAM hwm before rotation 0 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: allreduce buffer size 524288000 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: allreduce hwm 0 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: Real CC buffer size 0 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: DRAM hwm after rotation 0 2025-09-05T19:19:24Z INFO 2007 [DMAOptimizationBase]: DRAM Rotation rotated 0 Dram address 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: address_rotation_dram finished after 0.171 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running tensorcopy_accel 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to tensorcopy_accel: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z INFO 2007 [TensorCopyAccel::Impl]: Running peephole optimization pass 2025-09-05T19:19:24Z INFO 2007 [TensorCopyAccel::Impl]: Accelerated 0 out of 54337 tensorcopy in Function: sg0000 average acceleration factor: -nan 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: tensorcopy_accel finished after 0.025 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running peephole_opts 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to peephole_opts: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z INFO 2007 [PeepholeOpts]: PeepholeOpts enabled? Recip: true Tsp: true Tc: false SplitSelect: true SimplifyMemset true 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: peephole_opts finished after 0.055 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running lower_kernel 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to lower_kernel: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Started running LowerKernel 2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Start of kernel lowering pass, number of insts: 285701, number of allocs: 69132 2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Scan BKs time (s): 0.013202 2025-09-05T19:19:24Z INFO 2007 [LowerKernel]: Lower BKs time (s): 3e-06 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: lower_kernel finished after 0.019 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running lower_nki_kernel 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to lower_nki_kernel: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: lower_nki_kernel finished after 0.017 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1939mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running dynamic_dma_cleanup 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to dynamic_dma_cleanup: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: dynamic_dma_cleanup finished after 0.029 seconds 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1941mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:24Z USER 2007 [ModuleForkPass]: Running birverifier 2025-09-05T19:19:24Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: birverifier finished after 0.195 seconds 2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1941mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: Running dynamic_dma_scan 2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Inputs to dynamic_dma_scan: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: dynamic_dma_scan finished after 0.027 seconds 2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1941mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:25Z USER 2007 [ModuleForkPass]: Running build_fdeps 2025-09-05T19:19:25Z INFO 2007 [ModuleForkPass]: Inputs to build_fdeps: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:25Z INFO 2007 [build_flow_deps]: Start build fdeps. Invocation: 2Fri Sep 5 19:19:25 2025 2025-09-05T19:19:25Z INFO 2007 [build_flow_deps]: Allocs: 69132 instructions: 285701 2025-09-05T19:19:26Z INFO 2007 [build_flow_deps]: Build fdeps inserted 714949 edges 2025-09-05T19:19:26Z INFO 2007 [build_flow_deps]: Done build fdeps 714949 Fri Sep 5 19:19:26 2025 2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: build_fdeps finished after 1.075 seconds 2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1963mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: Running remove_redundancies 2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Inputs to remove_redundancies: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove_clobbered_writes 2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove_clobbered_writes: 0 2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove_useless_insts 2025-09-05T19:19:26Z INFO 2007 [RemoveRedundancies]: remove Useless Instructions: 0 2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: remove_redundancies finished after 0.090 seconds 2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: curr_vmrss: 1963mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:26Z USER 2007 [ModuleForkPass]: Running anti_dependency_analyzer 2025-09-05T19:19:26Z INFO 2007 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:26Z INFO 2007 [AntiDependencyAnalyzer]: Batch size: 1000 2025-09-05T19:19:26Z INFO 2007 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS,PSUM,SB} 2025-09-05T19:19:26Z INFO 2007 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: anti_dependency_analyzer finished after 0.989 seconds 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2058mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: Running tensor_copy_elim 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Inputs to tensor_copy_elim: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:27Z INFO 2007 [TensorCopyElim]: Tensor CP elimination: 0 2025-09-05T19:19:27Z INFO 2007 [TensorCopyElim]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: tensor_copy_elim finished after 0.240 seconds 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2071mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: Running prefetch_scheduling_before_sched 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Inputs to prefetch_scheduling_before_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: prefetch_scheduling_before_sched finished after 0.003 seconds 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2071mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:27Z USER 2007 [ModuleForkPass]: Running post_sched 2025-09-05T19:19:27Z INFO 2007 [ModuleForkPass]: Inputs to post_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:27Z INFO 2007 [post_scheduler]: Start PosT ScheD 3 sunda Fri Sep 5 19:19:27 2025 2025-09-05T19:19:29Z INFO 2007 [post_scheduler]: Time-aware hwm post-sched 2025-09-05T19:19:31Z INFO 2007 [post_scheduler]: Time-aware simulation time: 59414448 2025-09-05T19:19:32Z INFO 2007 [post_scheduler]: Done PosT ScheD Fri Sep 5 19:19:32 2025 2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: post_sched finished after 4.715 seconds 2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2468mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: Running expand_scheduling_units 2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Inputs to expand_scheduling_units: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: expand_scheduling_units finished after 0.027 seconds 2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2218mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:32Z USER 2007 [ModuleForkPass]: Running address_rotation_sb 2025-09-05T19:19:32Z INFO 2007 [ModuleForkPass]: Inputs to address_rotation_sb: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:34Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 9197 PSUM Banks 2025-09-05T19:19:34Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 5583 PSUM Banks 2025-09-05T19:19:35Z INFO 2007 [DMAOptimizationBase]: PSUM Rotation rotated 0 PSUM Banks 2025-09-05T19:19:35Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 2554 Sb address 2025-09-05T19:19:36Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 2329 Sb address 2025-09-05T19:19:36Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:36Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:37Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 59 Sb address 2025-09-05T19:19:37Z INFO 2007 [DMAOptimizationBase]: moved 0 MM forward 2025-09-05T19:19:37Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:38Z INFO 2007 [DMAOptimizationBase]: SB Rotation rotated 0 Sb address 2025-09-05T19:19:38Z USER 2007 [ModuleForkPass]: address_rotation_sb finished after 5.695 seconds 2025-09-05T19:19:38Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2254mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:38Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:38Z USER 2007 [ModuleForkPass]: Running anti_dependency_analyzer 2025-09-05T19:19:38Z INFO 2007 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:38Z INFO 2007 [AntiDependencyAnalyzer]: Batch size: 1000 2025-09-05T19:19:38Z INFO 2007 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS,PSUM,SB} 2025-09-05T19:19:38Z INFO 2007 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: anti_dependency_analyzer finished after 1.349 seconds 2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2285mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: Running anti_dependency_analyzer 2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Inputs to anti_dependency_analyzer: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:39Z INFO 2007 [AntiDependencyAnalyzer]: Batch size: 1000 2025-09-05T19:19:39Z INFO 2007 [AntiDependencyAnalyzer]: Analysis types: {DRAM,ALIAS} 2025-09-05T19:19:39Z INFO 2007 [AntiDependencyAnalyzer]: DRAM size: 17179869184 num-bins: 16 bin-size: 1073741824 2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: anti_dependency_analyzer finished after 0.188 seconds 2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2287mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:39Z USER 2007 [ModuleForkPass]: Running dep_opt 2025-09-05T19:19:39Z INFO 2007 [ModuleForkPass]: Inputs to dep_opt: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:39Z INFO 2007 [build_flow_deps]: Start build fdeps. Invocation: 3Fri Sep 5 19:19:39 2025 2025-09-05T19:19:39Z INFO 2007 [build_flow_deps]: Allocs: 69132 instructions: 285701 2025-09-05T19:19:40Z INFO 2007 [build_flow_deps]: Build fdeps inserted 701433 edges 2025-09-05T19:19:40Z INFO 2007 [build_flow_deps]: Done build fdeps 701433 Fri Sep 5 19:19:40 2025 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: dep_opt finished after 1.572 seconds 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running report_stats 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to report_stats: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z INFO 2007 [ReportStats]: Data Movement Statistics: sg0000 ┌─────────────┬────────────────────────────┬───────┬────────────┐ │ Instruction │ Kind │ Count │ Bytes │ ├─────────────┼────────────────────────────┼───────┼────────────┤ │ Load │ Const -> Internal │ 1 │ 32768 │ │ Load │ ExternalInput -> Internal │ 7105 │ 7114072064 │ │ Save │ Internal -> ExternalOutput │ 7105 │ 7114072064 │ └─────────────┴────────────────────────────┴───────┴────────────┘ 2025-09-05T19:19:41Z INFO 2007 [ReportStats]: ┌─────────────────────┬───────┐ │ Bytes per partition │ Count │ ├─────────────────────┼───────┤ │ 64 │ 65 │ │ 256 │ 66 │ │ 7168 │ 4096 │ │ 8192 │ 9984 │ └─────────────────────┴───────┘ 2025-09-05T19:19:41Z INFO 2007 [ReportStats]: MM Stats: #MatMults 217153 #MatMult-Transposes 217153 2025-09-05T19:19:41Z INFO 2007 [ReportStats]: IO Tensor size combined: 14496579584 2025-09-05T19:19:41Z INFO 2007 [ReportStats]: IO Tensor Statistics: ┌────────────────────┬────────────────┬──────────┬──────────────┐ │ Largest IO Tensors │ Kind │ Src Type │ Size (Bytes) │ ├────────────────────┼────────────────┼──────────┼──────────────┤ │ output289 │ ExternalOutput │ bfloat16 │ 134217728 │ │ input0 │ ExternalInput │ bfloat16 │ 134217728 │ │ output0 │ ExternalOutput │ bfloat16 │ 134217728 │ │ input289 │ ExternalInput │ bfloat16 │ 134217728 │ │ input7 │ ExternalInput │ bfloat16 │ 58720256 │ │ input9 │ ExternalInput │ bfloat16 │ 58720256 │ │ input24 │ ExternalInput │ bfloat16 │ 58720256 │ │ input18 │ ExternalInput │ bfloat16 │ 58720256 │ │ input15 │ ExternalInput │ bfloat16 │ 58720256 │ │ input27 │ ExternalInput │ bfloat16 │ 58720256 │ └────────────────────┴────────────────┴──────────┴──────────────┘ 2025-09-05T19:19:41Z INFO 2007 [ReportStats]: Large (Internal) Tensor Statistics: ┌────────────────────────────┬──────────┬──────────┬──────────────┐ │ Largest Tensors │ Kind │ Src Type │ Size (Bytes) │ ├────────────────────────────┼──────────┼──────────┼──────────────┤ │ DynamicDMAScratchLoc │ Internal │ uint8 │ 2097152 │ │ t2035_pftranspose_18320_i5 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i2 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i1 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i3 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i4 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i6 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i9 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i8 │ Internal │ bfloat16 │ 1048576 │ │ t2035_pftranspose_18320_i7 │ Internal │ bfloat16 │ 1048576 │ └────────────────────────────┴──────────┴──────────┴──────────────┘ 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: report_stats finished after 0.059 seconds 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 28.058 seconds 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running assign_trigger_engine 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to assign_trigger_engine: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z INFO 2007 [AssignTriggerEngine]: Assigned trigger engine for 0 DMA instructions. Moved 0 DMA instructions to CC's engines. 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: assign_trigger_engine finished after 0.097 seconds 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: Running lower_local_collectives 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Inputs to lower_local_collectives: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: lower_local_collectives finished after 0.004 seconds 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: Running extend_shared_lifetimes 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Inputs to extend_shared_lifetimes: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: extend_shared_lifetimes finished after 0.003 seconds 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2248mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: Running dead_code_elim 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Inputs to dead_code_elim: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z INFO 2007 [DeadCodeElim]: eliminateDeadStore removed 0 instructions 2025-09-05T19:19:41Z USER 2007 [SubgraphForkPass]: dead_code_elim finished after 0.245 seconds 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.266 seconds 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running assign_hwdge_engine 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to assign_hwdge_engine: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: assign_hwdge_engine finished after 0.027 seconds 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running mod_parallel_pass 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running alloc_queues 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to alloc_queues: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z INFO 2007 [AllocQueues]: DMACopy transpose will be triggered from multiple engines 2025-09-05T19:19:41Z INFO 2007 [AllocQueues]: Alloc Queue info: ┌─────────────────┬────────────────┬────────┬────────────┬──────────────────┐ │ Name │ DMAQueue::Type │ Engine │ Num Queues │ Num instructions │ ├─────────────────┼────────────────┼────────┼────────────┼──────────────────┤ │ qSPSpillReload0 │ data │ SP │ 16 │ 1 │ │ qPoolDynamic │ dynamic │ Pool │ 16 │ 14210 │ └─────────────────┴────────────────┴────────┴────────────┴──────────────────┘ 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: alloc_queues finished after 0.030 seconds 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running chain_dma_transposes 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to chain_dma_transposes: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: chain_dma_transposes finished after 0.003 seconds 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running prefetch_scheduling_after_sched 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to prefetch_scheduling_after_sched: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: prefetch_scheduling_after_sched finished after 0.003 seconds 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: Running lower_control 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Inputs to lower_control: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z INFO 2007 [LowerControl]: EraseInterBbDeps removed 0 inter-BB deps 2025-09-05T19:19:41Z USER 2007 [ModuleForkPass]: lower_control finished after 0.197 seconds 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.252 seconds 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: curr_vmrss: 2249mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [BackendPassManager]: Running nc_parallel_pass 2025-09-05T19:19:41Z INFO 2007 [BackendPassManager]: Inputs to nc_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z USER 2007 [CoreForkPass]: Running dep_reduction 2025-09-05T19:19:41Z INFO 2007 [CoreForkPass]: Inputs to dep_reduction: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:41Z INFO 2007 [DepReduction]: Start Dependency Reduction 2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing async instrs... 2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing secondary edges per engine... 2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing secondary edges per engine, Done. Num edges removed 484672 2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing redundant descendants, Done. Num edges removed 497220 2025-09-05T19:19:42Z INFO 2007 [DepReduction]: Processing async instrs, Done. Num edges removed 497220 2025-09-05T19:19:43Z INFO 2007 [DepReduction]: Num Async removed: 0 2025-09-05T19:19:43Z INFO 2007 [DepReduction]: Finished dependency reduction: 1173793 removed, new total 115343 2025-09-05T19:19:43Z INFO 2007 [DepReduction]: Finished Dependency Reduction 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: dep_reduction finished after 1.714 seconds 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running lower_dynamic_dma 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to lower_dynamic_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: lower_dynamic_dma finished after 0.048 seconds 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running legalize_dynamic_dma 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to legalize_dynamic_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:43Z INFO 2007 [LegalizeDynamicDMA]: Legalize Dynamic DMA scanned 0 DGE instructions 2025-09-05T19:19:43Z INFO 2007 [LegalizeDynamicDMA]: After Legalize Dynamic DMA, 0 DGE instructions were scanned 2025-09-05T19:19:43Z INFO 2007 [LegalizeDynamicDMA]: ┌───────────┬───────────────────────────────┬────────────────────────────┐ │ Sub-Pass │ Illegal Instructions Detected │ New Instructions Generated │ ├───────────┼───────────────────────────────┼────────────────────────────┤ │ Peeling │ 0 │ 0 │ │ Unrolling │ 0 │ 0 │ │ Splitting │ 0 │ 0 │ └───────────┴───────────────────────────────┴────────────────────────────┘ 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: legalize_dynamic_dma finished after 0.100 seconds 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285701 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running lower_dma 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to lower_dma: modules=1 functions=1 allocs=69132 blocks=1 instructions=285701 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:43Z INFO 2007 [LowerDMA]: lower_dma metrics start IO Copy (DGE/DMA) 128 partition : 14145/14145 (100% DGE) power-of-2 partition : 14210/14210 (100% DGE) > 3 dimensional : 0/0 non-integer desc size : 0/0 total : 14210/14210 (100% DGE) Cast (DGE/DMA) 128 partition : 0/0 power-of-2 partition : 0/0 > 3 dimensional : 0/0 non-integer desc size : 0/0 total : 0/0 Spill/Reload Copy (DGE/DMA) 128 partition : 0/1 (0% DGE) power-of-2 partition : 0/1 (0% DGE) > 3 dimensional : 0/0 non-integer desc size : 0/0 total : 0/1 (0% DGE) Cast (DGE/DMA) 128 partition : 0/0 power-of-2 partition : 0/0 > 3 dimensional : 0/0 non-integer desc size : 0/0 total : 0/0 CopyMode CCE : 0 Transpose : 0 Replicate : 0 Dynamic (DGE/DMA) scalar : 0/0 vector : 0/0 Opcode ReadVarAddr : 0 IndirectLoad : 0 IndirectSave : 0 IndirectSaveAccumulate : 0 DstReduceDGE : 0 lower_dma metrics end 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: lower_dma finished after 0.145 seconds 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: curr_vmrss: 2263mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:43Z USER 2007 [CoreForkPass]: Running coalesce_dma_blocks 2025-09-05T19:19:43Z INFO 2007 [CoreForkPass]: Inputs to coalesce_dma_blocks: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z INFO 2007 [CoalesceDmaBlocks]: Coaleseced 0 DMA triggers 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: coalesce_dma_blocks finished after 0.133 seconds 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running expand_all_engine 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to expand_all_engine: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: expand_all_engine finished after 0.050 seconds 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running alloc_semaphores 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to alloc_semaphores: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: alloc_semaphores finished after 0.242 seconds 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running expand_inst_late 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to expand_inst_late: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: expand_inst_late finished after 0.262 seconds 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running seq_inst_opt 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to seq_inst_opt: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z INFO 2007 [SeqInstOpt]: Removing 0 unnecessary InstRegisterMove instruction(s) from Block1 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: seq_inst_opt finished after 0.027 seconds 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 285710 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running lower_sync 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to lower_sync: modules=1 functions=1 allocs=69132 blocks=1 instructions=285710 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: lower_sync finished after 0.102 seconds 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300967 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running lower_act 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to lower_act: modules=1 functions=1 allocs=69132 blocks=1 instructions=300967 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: lower_act finished after 0.036 seconds 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: curr_vmrss: 2264mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z USER 2007 [CoreForkPass]: Running lower_dve 2025-09-05T19:19:44Z INFO 2007 [CoreForkPass]: Inputs to lower_dve: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:44Z INFO 2007 [LowerDVE]: Loading DVE opcodes table dve_info.json from /opt/conda/lib/python3.10/site-packages/neuronxcc/dve/dve_bin_gen2/dve_info.json 2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: lower_dve finished after 0.303 seconds 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: curr_vmrss: 2296mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: Running lower_ap 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Inputs to lower_ap: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: lower_ap finished after 0.066 seconds 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: curr_vmrss: 2177mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: Running coloring_allocator_reg 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Inputs to coloring_allocator_reg: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z INFO 2007 [ColoringAllocator::Rep]: Allocating functions 2025-09-05T19:19:45Z INFO 2007 [ColoringAllocator::Rep]: linearize and check 2025-09-05T19:19:45Z INFO 2007 [REG_Allocator]: allocating REG 2025-09-05T19:19:45Z INFO 2007 [REG_Allocator]: main loop iteration 1 2025-09-05T19:19:45Z USER 2007 [CoreForkPass]: coloring_allocator_reg finished after 0.044 seconds 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [CoreForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: nc_parallel_pass finished after 3.389 seconds 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: Running mod_parallel_pass 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [ModuleForkPass]: Running birverifier 2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: Inputs to birverifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [ModuleForkPass]: birverifier finished after 0.254 seconds 2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 0.261 seconds 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: Running subgraph_parallel_pass 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Inputs to subgraph_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [SubgraphForkPass]: Running lnc_verifier 2025-09-05T19:19:45Z INFO 2007 [SubgraphForkPass]: Inputs to lnc_verifier: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [SubgraphForkPass]: lnc_verifier finished after 0.003 seconds 2025-09-05T19:19:45Z INFO 2007 [SubgraphForkPass]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [SubgraphForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: subgraph_parallel_pass finished after 0.010 seconds 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: curr_vmrss: 2190mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [BackendPassManager]: Running mod_parallel_pass 2025-09-05T19:19:45Z INFO 2007 [BackendPassManager]: Inputs to mod_parallel_pass: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z USER 2007 [ModuleForkPass]: Running codegen 2025-09-05T19:19:45Z INFO 2007 [ModuleForkPass]: Inputs to codegen: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:45Z INFO 2007 [Codegen]: Total compiler allocated DRAM tensors: 0 GB 2025-09-05T19:19:45Z INFO 2007 [Codegen]: Total un-allocated DRAM tensors by kind: 2025-09-05T19:19:45Z INFO 2007 [Codegen]: ┌───────────────┬─────────────┐ │ TensorKind │ Size (GB) │ ├───────────────┼─────────────┤ │ ExternalInput │ 6.7505 │ │ Const │ 3.05176e-05 │ └───────────────┴─────────────┘ 2025-09-05T19:19:45Z INFO 2007 [Codegen]: Total runtime managed DRAM tensors: 6.75053 GB 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Instruction Stats: 2025-09-05T19:19:46Z INFO 2007 [Codegen]: ┌─────────────────────┬────────┐ │ Opcode │ Count │ ├─────────────────────┼────────┤ │ LDWEIGHTS │ 217153 │ │ MATMUL │ 217153 │ │ ACTIVATE │ 54337 │ │ EVENT_SEMAPHORE │ 15257 │ │ UNKNOWN(0xd4) │ 14210 │ │ NOP │ 8 │ │ PSEUDO_BRANCH_LABEL │ 5 │ │ ACT_TABLE_LOAD │ 1 │ │ PSEUDO_DMA_TRIGGER │ 1 │ └─────────────────────┴────────┘ 2025-09-05T19:19:46Z INFO 2007 [Codegen]: ┌────────────┬────────┐ │ Engine │ Count │ ├────────────┼────────┤ │ Unassigned │ 0 │ │ GPSIMD │ 20924 │ │ Scalar │ 56947 │ │ Tensor │ 440254 │ │ SyncDMA │ 0 │ │ Vector │ 2 │ │ Sync │ 3 │ │ All │ 0 │ └────────────┴────────┘ 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Total instructions: 518130 (0.030883 GB) 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Total DynamicDMA instruction count: 14210 2025-09-05T19:19:46Z USER 2007 [Codegen]: isa_gen finished after 1.105 seconds 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Number of DMA descriptors on each queue instance: ┌─────────────────┬────────────────┐ │ Queue Instance │ RT Descriptors │ ├─────────────────┼────────────────┤ │ qSPSpillReload0 │ 256 │ └─────────────────┴────────────────┘ Total descriptors: 256 (3.8147e-06 GB) 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Number of DMA engines used by each queue: ┌─────────────────┬─────────────────────┐ │ Queue │ DMA Engines │ ├─────────────────┼─────────────────────┤ │ qSPSpillReload0 │ 16 │ │ qPoolDynamic │ 16 │ ├─────────────────┼─────────────────────┤ │ TOTAL │ 32 (must be <= 176) │ └─────────────────┴─────────────────────┘ 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Tensors with largest descriptor count: ┌──────────────────────┬──────────┬──────────┬──────────────────┐ │ Tensor Name │ Kind │ Src Type │ Descriptor Count │ ├──────────────────────┼──────────┼──────────┼──────────────────┤ │ identity_local_22027 │ Internal │ bfloat16 │ 1 │ │ identity_22025 │ Const │ bfloat16 │ 1 │ └──────────────────────┴──────────┴──────────┴──────────────────┘ 2025-09-05T19:19:46Z USER 2007 [Codegen]: dma_desc_gen finished after 0.004 seconds 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Estimated peak DRAM usage: 6.78141 GB 2025-09-05T19:19:46Z INFO 2007 [Codegen]: Generating debug info 2025-09-05T19:19:47Z USER 2007 [Codegen]: debug_info_gen finished after 0.568 seconds 2025-09-05T19:19:47Z USER 2007 [ModuleForkPass]: codegen finished after 1.742 seconds 2025-09-05T19:19:47Z INFO 2007 [ModuleForkPass]: curr_vmrss: 2385mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:47Z INFO 2007 [ModuleForkPass]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:47Z USER 2007 [BackendPassManager]: mod_parallel_pass finished after 1.775 seconds 2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: curr_vmrss: 2204mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:47Z USER 2007 [BackendPassManager]: Running neff_packager 2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: Inputs to neff_packager: modules=1 functions=1 allocs=69132 blocks=1 instructions=300968 Max writers: 128 Max Readers: 217153 2025-09-05T19:19:47Z WARNING 2007 [NeffFileWriter]: writeKelp missing file /local/p4clients/pkgbuild-const/workspace/build/KaenaCompiler/KaenaCompiler-2.x.169490.0/AL2_x86_64/DEV.STD.PTHREAD/build/private/_skbuild/linux-x86_64-3.10/cmake-build/neuronxcc/walrus/neff_packager/MetricMetadata.json 2025-09-05T19:19:47Z INFO 2007 [NeffFileWriter]: Neff will be written to: /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff 2025-09-05T19:19:47Z INFO 2007 [NeffFileWriter]: IR signature: b43fb3a95505587f066b6d2d419d7150 for neff artifacts 2025-09-05T19:19:47Z USER 2007 [BackendPassManager]: neff_packager finished after 0.299 seconds 2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: curr_vmrss: 2205mb, ru_maxrss: 2541mb (delta=0mb) 2025-09-05T19:19:47Z INFO 2007 [BackendPassManager]: Output has 1 module(s), 1 function(s), 69132 memory location(s), 1 block(s), and 300968 instruction(s). Max writers: 128 Max Readers: 217153 2025-09-05T19:19:47Z INFO 2007 [BackendDriver]: HBM scratchpad usage summary (post-allocation): ┌──────┬───────────┬────────────────────────────────────────────────────────────┬─────────────┐ │ Core │ Subgraph │ Description │ Value │ ├──────┼───────────┼────────────────────────────────────────────────────────────┼─────────────┤ │ nc00 │ module │ Peak scratchpad usage: local │ 0.000000 GB │ │ nc00 │ module │ Total size of allocated tensors: local │ 0.000000 GB │ │ nc00 │ Max │ Peak scratchpad usage: local │ 0.000000 GB │ │ nc00 │ Post-link │ Peak scratchpad usage after intermediate tensor allocation │ 0.000000 GB │ │ nc00 │ Post-link │ Total size of allocated intermediate tensors │ 0.000000 GB │ ├──────┼───────────┼────────────────────────────────────────────────────────────┼─────────────┤ │ Max │ Max │ Peak scratchpad usage │ 0.000000 GB │ │ Max │ Max │ Peak scratchpad usage (page-aligned) │ 0.000000 GB │ └──────┴───────────┴────────────────────────────────────────────────────────────┴─────────────┘ 2025-09-05T19:19:47Z INFO 2007 [BackendDriver]: Backend completed successfully, tearing down. 2025-09-05T19:19:48Z INFO 1971 [job.WalrusDriver.0]: Job #0 finished 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.WalrusDriver.0 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.BIRLinker.0 2025-09-05T19:19:48Z INFO 1971 [job.BIRLinker.0]: Replay this job by calling: /opt/conda/bin/neuronx-cc compile --framework XLA --state '{"model": ["/models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo"], "tensormap": "tensor_map.json", "bir": "bir.json", "lorean_sg_key": null, "input_name_map": null, "output_name_map": null, "constant_tensors": null, "state_dir": "/opt/vllm/neuronxcc-gss0wi_w/sg00", "state_id": "sg00"}' --pipeline BIRLinker 2025-09-05T19:19:48Z INFO 1971 [job.BIRLinker.0]: BIRLinker cwd: /opt/vllm/neuronxcc-gss0wi_w 2025-09-05T19:19:48Z INFO 1971 [job.BIRLinker.0]: Linking not needed. Netlist doesnt exist 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.BIRLinker.0 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.Kelper.0 2025-09-05T19:19:48Z INFO 1971 [job.Kelper.0]: Skipping neff generation which was already performed by neff_packager 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.Kelper.0 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Starting job job.NeffWrapper.0 2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Job NeffWrapper len(in_states) 1 2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Processing input #0 2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Start NeffWrapper 2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Executing: /opt/conda/lib/python3.10/site-packages/neuronxcc/starfish/bin/hlo-neff-wrapper --hlo /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/model/graph.hlo --neff /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/graph.neff --io_transposes /opt/vllm/neuronxcc-gss0wi_w/io_transposes.json --output /models/mistral-7b-v0.3-instruct-neuronx/layout_opt/wrapped_neff.hlo --netlist /opt/vllm/neuronxcc-gss0wi_w/hlo_netlist.json 2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Could not open file: /opt/vllm/neuronxcc-gss0wi_w/hlo_netlist.json There are no io transposes nor zero-sized parameters. Output will not be produced. Hlo neff wrapper finished successfully. Have a wonderful day :D 2025-09-05T19:19:48Z INFO 1971 [job.NeffWrapper.0]: Job #0 finished 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished job job.NeffWrapper.0 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Finished pipeline Pipeline 2025-09-05T19:19:48Z INFO 1971 [pipeline.Pipeline.0]: Job #0 finished 2025-09-05T19:19:48Z INFO 1939 [root]: Subcommand returned with exitcode=0