The max new tokens parameter is capped at 4096 in watsonx.ai for the Mixtral model, but the model supports a context window of up to 32768 tokens (input + output). So when the output is set to 4096, you have 32768 - 4096 = 28672 tokens, which is the maximum input the model will accept.
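The budget arithmetic above can be sketched in a few lines of Python. The 32768 context window and the 4096 max-new-tokens cap come from this thread; the function name is just for illustration, not part of any watsonx.ai SDK.

```python
# Token-budget arithmetic for mixtral-8x7b-instruct-v01-q on watsonx.ai.
# The constants below are the figures discussed in this thread.

CONTEXT_WINDOW = 32768   # total tokens the model accepts (input + output)
MAX_NEW_TOKENS = 4096    # watsonx.ai cap on generated (output) tokens

def max_input_tokens(context_window: int, max_new_tokens: int) -> int:
    """Largest prompt size that still leaves room for the requested output."""
    return context_window - max_new_tokens

print(max_input_tokens(CONTEXT_WINDOW, MAX_NEW_TOKENS))  # 28672
```

In other words, asking for a larger output directly shrinks the prompt budget, since both must fit inside the same context window.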
Hope it helps!
Original Message:
Sent: Thu February 29, 2024 01:34 AM
From: Jing Zhang
Subject: max new token limit of mixtral-8x7b-instruct-v01-q model in watsonx
Hi, Catherine,
Thanks for your reply. We are already on a paid plan, but the 4096-token restriction still exists. Do we need to upgrade the paid plan?
------------------------------
Jing Zhang
Original Message:
Sent: Wed February 28, 2024 02:52 PM
From: Catherine CAO
Subject: max new token limit of mixtral-8x7b-instruct-v01-q model in watsonx
Hi, Jing,
If you are on a free Lite plan, the max token is limited to 4096 tokens. When on a paid plan, that restriction will be removed. Hope it helps!
------------------------------
Catherine CAO
Original Message:
Sent: Mon February 26, 2024 02:57 AM
From: Jing Zhang
Subject: max new token limit of mixtral-8x7b-instruct-v01-q model in watsonx
I noticed that the max new token limit of the mixtral-8x7b-instruct-v01-q model in watsonx.ai is 32768, but when I test this model in the watsonx.ai Prompt Lab and set max new tokens to 32767, I get an error. I found that the max new token limit in Prompt Lab is 4096, so I'm confused about this.
------------------------------
Jing Zhang
------------------------------
#AIandDSSkills