Download Ryzen AI/LLM Software for Desktop – Windows And Linux

By | September 27, 2025

Running LLMs locally using an AMD GPU (RX 7900 XTX and RX 9070 XT etc.) can be done using the AMD funded open source software called Lemonade that you can download on your Windows and Linux machines. Lemonade is local LLM desktop server software that has components and runtime that allows access to NPU on supported CPU and Graphics for local LLM inference and has both GUI and command line interfaces. Check out below to see How you can download the GUI version of installation of Lemonade and run LLMs without sending data to cloud.

AMD local llm runner interface

Download Lemonade exe (AMD LLM software) for Windows 10

Download site

Lemonade_Server_Installer exe link

AMD AI Software lemonade
lemonade server chat windows

Currently Trending LLM models for local use

Model NameRepositoryDescriptionDownload Link
Qwen3-30B-A3B-Instruct-2507-GGUFunsloth/Qwen3-30B-A3B-Instruct-2507-GGUFUpdated version of Qwen3-30B-A3B non-thinking mode with enhancements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage. Supports up to 262144 context length.Hugging Face
Qwen3-Coder-30B-A3B-Instruct-GGUFunsloth/Qwen3-Coder-30B-A3B-Instruct-GGUFCoding-focused variant of Qwen3-30B-A3B-Instruct, optimized for programming tasks with support for long outputs up to 65,536 tokens. Features improved reasoning and tool usage.Hugging Face
gpt-oss-120b-GGUFunsloth/gpt-oss-120b-GGUFOpen-weight Mixture-of-Experts (MoE) model from OpenAI with 117B total parameters (5.1B active per token), excelling in coding, math, health, and agentic tool use. Apache 2.0 licensed.Hugging Face
gpt-oss-20b-GGUFunsloth/gpt-oss-20b-GGUFSmaller open-weight MoE model from OpenAI with 21B total parameters (3.6B active per token), suitable for fine-tuning on consumer hardware. Strong performance in reasoning tasks.Hugging Face
GLM-4.5-Air-UD-Q4K-XL-GGUFunsloth/GLM-4.5-Air-GGUFQuantized (UD-Q4_K_XL) version of GLM-4.5-Air, a high-performance MoE model praised for fast tool calls, reasoning, and agentic capabilities. Supports up to 32K context.Hugging Face

Alternatives to Ryzen AI/LLM Software :

If Lemonade is not your cup of tea 😀 then we have some other recommendations that work with both Nvidia and AMD graphics card and are quite popular too:

Jan Ai – This one I use currently and it is open source like Lemonade

LM Studio – Closed source, another brilliant alternative

Jan AIOpen-source ChatGPT alternative that runs 100% offline on your computer, supporting local LLMs like those from Hugging Face.Official Download
LM StudioDesktop app to discover, download, and run local LLMs (e.g., Llama, Gemma, Qwen) privately on your machine.Official Download

Lemonade Intro Video

Lemonade software supported AMD GPU Series

GPU SeriesArchitectureKey ExamplesNotes
Radeon 7000 SeriesRDNA 3RX 7900 XTX, RX 7800 XTDiscrete GPUs, strong Vulkan support for high-end LLM inference, hybrid capable with Ryzen AI.
Radeon 9000 SeriesRDNA 4RX 9070 XT, RX 9900 XTXLatest gen (2024–2025), enhanced AI/RT features; primary focus for new deployments.
Ryzen AI Integrated GPUs (7000/8000/300 Series)RDNA 2/3Ryzen 7 7840HS iGPU, Ryzen AI 9 HX 370 iGPUIntegrated in APUs, Vulkan-accelerated,essential for hybrid NPU+iGPU on laptops/desktops.

Older AMD GPU with Vulkan drivers (e.g. RX 6000 series on RDNA 2) can also work, but performance is best on the above. ROCm support (for Linux) targets similar modern hardware.

Leave a Reply