Hi All. I'm hoping that someone can explain to me what is going on with nanoGPT and my RTX 4090.
I had a Win11 workstation with a GTX 1080 running an extremely small nanoGPT model generating N new tokens. I set the seed to a constant value before each generate call. The N tokens were returned in ~7-8 secs every time. This was very consistent and the results were reproducible. All was good. The GTX 1080 ran at around 35% load.
I upgraded to a new Win11 workstation with an RTX 4090 and migrated the same code, same data, and same model. The generate time on the RTX 4090 now varies from 3 secs to 40 secs, and the GPU runs at around 96% load.
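For reference, here is a minimal sketch of the seeding and timing setup described above, assuming a nanoGPT-style `GPT` instance `model` and a prompt tensor `idx` (both hypothetical names here). The `torch.cuda.synchronize()` calls are an addition of mine so the clock measures completed GPU work rather than just kernel launches:

```python
import time
import torch

def timed_generate(model, idx, max_new_tokens, seed=1337):
    # Reseed before every call, as described above, for reproducible output.
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)

    torch.cuda.synchronize()  # ensure no prior GPU work skews the start time
    t0 = time.time()
    out = model.generate(idx, max_new_tokens)
    torch.cuda.synchronize()  # wait for the GPU to finish before stopping the clock
    return out, time.time() - t0
```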
Can anyone explain to me why this is occurring? The results are good, but the variance in generate time is driving me crazy.
Thanks in advance,
Art