Qwen/Qwen3-Next-80B-A3B-Instruct

When fine-tuning Qwen3-Next with Megatron, does setting --target_modules to "all-linear" also train the Qwen3NextGatedDeltaNet layers?

👀 1

#41 opened 2 days ago by

alanayu

Add Artificial Analysis evaluations for qwen3-next-80b-a3b-instruct

#40 opened 6 days ago by

mackenzietechdocs

Will a "VL" version of Qwen3-Next be released in the future?

#39 opened 9 days ago by

banne2266

Problems with inference

1

#38 opened 22 days ago by

Kirill200223

Issues with Fine Tuning

👍 2

1

#37 opened about 2 months ago by

rirv938

Has anybody got MTP working on VLLM? ('GPUModelRunner' object has no attribute 'drafter')

#36 opened about 2 months ago by

stev236

Generates nonsense when run with latest VLLM with Flashinfer 0.4

#35 opened about 2 months ago by

stev236

Bug: Running the example gives nonsensical response on 8xH100

#33 opened 2 months ago by

kz919

return null

#32 opened 2 months ago by

sakuramiko35

How much VRAM is needed for the full context length?

6

#31 opened 3 months ago by

Aly87

Could an expert explain what this line of code means?

#30 opened 3 months ago by

bluelueSea

Int4 quantization broken

3

#28 opened 3 months ago by

TheBigBlockPC

Could you release a 20B‑scale MoE version? Thank you very much.

🔥 1

1

#27 opened 3 months ago by

houxiaowei

Awesome! Please be sure to train an 80B A3B next-version coder model!

🔥 10

#26 opened 3 months ago by

wukongai

Bug report with running with transformers

#25 opened 3 months ago by

qsstcl

Only 2k max-tokens in lm-studio?

#24 opened 3 months ago by

jkkit

qwen

#23 opened 3 months ago by

Dumpy13

Test

#22 opened 3 months ago by

vhm8356

VRAM requirement for maximum token length?

🚀 5

#21 opened 3 months ago by

Donhuay

Guide for running this with 12 GB VRAM and 180 GB RAM on dual CPUs in vLLM at 0.5 to 0.6 t/s

🔥 👍 4

2

#20 opened 3 months ago by

gopi87

Fix broken qwen3-next blog link

#19 opened 3 months ago by

Smorty100

FP8 please

👀 ➕ 16

8

#18 opened 3 months ago by

aliquis-pe

model_use

#17 opened 3 months ago by

mohanpichikala

Will smaller Qwen3-Next models be released in the future?

➕ 👀 7

1

#15 opened 3 months ago by

ZAID041

abhai

#14 opened 3 months ago by

Abhai121

🚀 Best Practices for Evaluating the Qwen3-Next Model

🚀 👍 8

#13 opened 3 months ago by

Yunxz

Is it possible to finetune with ms-swift?

🚀 1

3

#12 opened 3 months ago by

phosira

Reduced multilingual quality

👍 1

3

#11 opened 3 months ago by

rastegar

Way ahead of the pack

2

#10 opened 3 months ago by

OrlandoHugBot

Testing with the README code returns garbled output

5

#9 opened 3 months ago by

tarjintor

Plan for AWQ?

➕ 26

3

#8 opened 3 months ago by

hyunw55

How much GPU memory is needed for local deployment?

13

#7 opened 3 months ago by

XuehangCang

fix the blog link

1

#6 opened 3 months ago by

ryan-u

Will there be a dedicated technical report for Qwen3-Next?

👍 6

#5 opened 3 months ago by

Gmc2

The model is wholesome

🔥 1

2

#4 opened 3 months ago by deleted

Local Installation Video and Testing On CPU - Step by Step

🤗 3

#3 opened 3 months ago by

fahdmirzac

No base model

👍 15

8

#2 opened 3 months ago by

ricardo-rei

GGUF when? 8 bit quant when?

➕ ❤️ 13

14

#1 opened 3 months ago by

ouchiewouchie